Comments on Data Mining in MATLAB: Basic Summary Statistics in MATLAB
Blog author: Will Dwinnell
3 comments (feed last updated 2023-11-03)

Posted 2009-06-08:
Hi,
I just found your blog as well, and I wanted to say thank you for all the hard work! The grouping part is a lifesaver (beforehand I had a loop running over 1M records... you can tell why I gave up on it).
Flying

Posted 2009-01-02:
Hi,
Thank you for your presentation. It's very useful and creative!
Would you please tell me which standard deviation formula is meant for which purpose? There are two: one with 'n' and the other with 'n-1' in the denominator.
Bhoj R Shrestha (posted as Anonymous)

Posted 2008-09-25:
Hi,

I have just found your blog, and I find it very interesting and useful.

Currently I'm using MATLAB 2008a for my thesis, looking at a lot of data. Thus speed is one of my primary concerns...

You have said:
"Note that the convention in MATLAB is for variables to be stored in columns, and observations to be stored in rows. This is not a hard-and-fast rule, but it is much more common than the alternative (variables in rows, observations in columns). Besides, most MATLAB routines (whether from the MathWorks or elsewhere) assume this convention."

This is true, and indeed (to my surprise) it is faster to sum across the rows (dimension 2) than down the columns (dimension 1):
>> x = randn(10000);
>> tic; sum(x,1); toc;
Elapsed time is 0.212993 seconds.
>> tic; sum(x,2); toc;
Elapsed time is 0.171381 seconds.

On the other hand, as far as I know, MATLAB is one of the few languages that store arrays in column-major order. Thus accessing the columns of a 2D array should be faster, because consecutive elements of a column are adjacent in memory and cause fewer cache misses.
So now I'm confused...

I know that better-structured code is more important than a few percent in execution time. Nevertheless, I'm interested...

I'd be happy for any comments regarding this...

Thanks,
Mark
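
The two technical questions raised in the comments above can be explored with a short MATLAB sketch. It is not taken from the original post. The sample vector `v`, the matrix size `n`, and all variable names are chosen here for illustration. The first part shows the two standard deviation formulas: `std(v)` divides by n-1 and is the usual choice when the data are a sample used to estimate the spread of a larger population, while `std(v, 1)` divides by n and fits the case where the data are the whole population. The second part repeats the timing comparison with `timeit` (available in more recent MATLAB releases than 2008a), which tends to be less noisy than a single `tic`/`toc` pair.

```matlab
% Part 1: the two standard deviation formulas.
v = [2 4 4 4 5 5 7 9];        % illustrative data, mean is 5
sSample = std(v);             % same as std(v, 0): denominator n-1
sPop    = std(v, 1);          % denominator n

% The same values written out by hand, for comparison.
dev2 = sum((v - mean(v)).^2); % sum of squared deviations
sSampleManual = sqrt(dev2 / (numel(v) - 1));
sPopManual    = sqrt(dev2 / numel(v));

% Part 2: summing along each dimension of a square matrix.
% n is smaller than the 10000 used in the comment, to limit memory use.
n = 2000;
x = randn(n);

colSums = sum(x, 1);          % 1-by-n: one sum per column
rowSums = sum(x, 2);          % n-by-1: one sum per row

% timeit calls the function several times and returns the median time.
tCols = timeit(@() sum(x, 1));
tRows = timeit(@() sum(x, 2));

fprintf('std with n-1: %.4f, std with n: %.4f\n', sSample, sPop);
fprintf('sum(x,1): %.4f s, sum(x,2): %.4f s\n', tCols, tRows);
```

The timings depend on the machine, the matrix size, and the MATLAB release, so no particular ordering should be expected. Built-in functions such as `sum` are compiled and may use several threads, so the column-major memory layout alone does not necessarily decide which call is faster.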