MATLAB Answers


RandStream generators & dieharder validation suite?

Asked by Brad Stiritz on 13 Jan 2013
Accepted Answer by Jan
Hi,
Has anyone investigated the various MATLAB RandStream generator algorithms (see table in middle of page) using the dieharder validation suite? I'm seeking to generate large numbers (in the millions) of exceptionally uniformly random values & would appreciate seeing the detailed tabular output that dieharder provides.
I'm not super-handy with Visual Studio, but can pull off the basics. I would like to build the dieharder project myself, but as far as I can tell it seems to be targeted towards Linux. Is this correct? Has anyone successfully modified the project for Visual Studio? If so, do you have general results you could share please about how any of the MATLAB random number generators fare?
I understand that my question relates in large part to a 3rd party product (Visual Studio) & thus isn't eligible for help from MATLAB support. Additionally, perhaps only a small fraction of MATLAB users are experienced with Linux, Windows, & Visual Studio.
Obviously, there are other external venues where I can pose this same question. However, given the extensive MATLAB support for random number generation & numerous discussion threads about RNG, I thought I'd give my question a shot here first, hoping to find an interested & expert cross-platform compiler user.
Knowing that offering something especially real in return for help often gets faster results, I'm up for negotiating (via private message) a fair Paypal fee if s/o can provide step-by-step instructions for how to get the latest dieharder distribution to build in VS 2010. Of course, freely-offered help is always appreciated, but not expected in this case.
Due to the complexity of the issue & potential for a confusing thread, I would like to ask please that you make sure to verify a successful build at your end before posting any super-lengthy Answers.
However I get this answered, either here or via some external site, & whether by a helpful altruist or a needy graduate student, I will make sure all relevant information is posted here: a complete step-by-step solution & my test results for 2-3 chosen RandStream generators.
Thanks, Brad

  3 Comments

Jan
on 13 Jan 2013
The nature of a public forum is to share solutions with the community. A discussion is part of finding a solution, so I would not see it as clutter. The offer of a fee does not encourage me to answer, and I do not think that this should happen more often.
Hi Jan,
Thanks for your comments. I'm sorry to have offended your sense of what is an appropriate posting on the MATLAB Answers site. I have gone back & extensively edited my Question, to hopefully make you feel better about seeing it on the site, as well as to clarify that all results of general interest will be posted.
Regarding your criticism of my offer of fee-for-service: as far as I can tell, this site does not prohibit offering remuneration for assistance. I don't think I have been crass about it & I have tried to show sensitivity in this regard.
Brad
Jan
on 14 Jan 2013
Dear Brad, without doubt you were very clear about your intention to share the results. You have been neither offensive nor rude. Offering a fee is polite, legal and fair. I do not want to discourage you from paying anybody who helps you to solve your problem. Therefore I have no reason to criticize the contents or tone of your question.
I have received too many personal messages from cheaters who offered some dollars for solving their homework. In contrast, your question obviously has a serious background. But the public appearance of money can have a bad influence on a forum which lives from voluntary contributors. That is why I wrote that I personally do not want this to happen more often, and not that it should not happen at all.
In another MATLAB forum there is a specific category for paid programming or assistance jobs. Unfortunately, about 20% of the threads in this category must be deleted because they violate the forum policies.
I hope my opinion got clearer now.


3 Answers

Answer by Jan
on 13 Jan 2013
Edited by Jan
on 13 Jan 2013
 Accepted Answer

Asking your favorite search engine would reveal some useful instructions in the net, e.g.:
Reading the instructions in the 2nd link is important: while compiling and running DIEHARDER is more or less easy, interpreting the results is very hard science. Because all pseudo-random number generators are deterministic, tests like DIEHARD and DIEHARDER can only check the entropy level of the output; they cannot prove that it is random.
If you need good random numbers, true random numbers are strongly recommended:
The underlying service at www.random.org is limited, see quota. Therefore getting "millions" of numbers might either take some time (days!), or you must pay for it. Another idea is to inflate the true random numbers by using them as frequently changing seeds for your pseudo-RNG. But in this case, testing the results with DIEHARDER is a good idea again.
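To make the reseeding idea concrete, here is a minimal sketch in Python rather than MATLAB, under some loud assumptions: `os.urandom` stands in for the true random source (it is the operating system's entropy-seeded generator, not random.org), and the names `os_entropy_seed` and `reseeded_uniforms` are made up for illustration. Whether reseeding actually improves the statistical quality is exactly what a test suite would have to show.

```python
import os
import random
from typing import Callable, Iterator


def os_entropy_seed() -> int:
    """Stand-in seed source (assumption): 8 bytes from the OS entropy pool."""
    return int.from_bytes(os.urandom(8), "big")


def reseeded_uniforms(count: int, reseed_every: int,
                      seed_source: Callable[[], int] = os_entropy_seed
                      ) -> Iterator[float]:
    """Yield `count` floats in [0, 1), reseeding the PRNG every `reseed_every` draws."""
    if reseed_every < 1:
        raise ValueError("reseed_every must be at least 1")
    prng = random.Random()
    for i in range(count):
        if i % reseed_every == 0:
            # Pull a fresh external seed before the next batch of draws.
            prng.seed(seed_source())
        yield prng.random()


# Ten values, with a new external seed every four draws.
sample = list(reseeded_uniforms(count=10, reseed_every=4))
```

The seed source is passed in as a function so that a scarce supply of true random numbers (for example a downloaded file) could be swapped in without touching the loop.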
Creating true-RNG hardware at home is not very hard: one idea is to let a USB camera record a lava lamp and take the differences between subsequent images to obtain random bits caused by noise. In a further step you can even omit the lava lamp: use a camera which creates noisier output for darker images and stick a black sheet of paper in front of the lens. Much more detailed instructions can be found by an internet search again.
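The frame-difference step can be sketched as follows (Python, illustration only). Everything here is an assumption: frames are flat lists of 8-bit pixel values, the two sample frames are simulated with a PRNG because no camera is attached, and `noise_bits` and `pack_bits` are hypothetical helper names. Raw bits harvested this way can be biased or correlated, so they would still need testing before use.

```python
import random
from typing import List, Sequence


def noise_bits(prev_frame: Sequence[int], next_frame: Sequence[int]) -> List[int]:
    """Return the least significant bit of each absolute pixel difference."""
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same number of pixels")
    return [abs(b - a) & 1 for a, b in zip(prev_frame, next_frame)]


def pack_bits(bits: Sequence[int]) -> bytes:
    """Pack bits into bytes, most significant bit first; leftover bits are dropped."""
    usable = len(bits) - len(bits) % 8
    out = bytearray()
    for i in range(0, usable, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# Simulated stand-in for two dark camera frames of 64 pixels each.
sim = random.Random(0)
frame_a = [sim.randint(0, 12) for _ in range(64)]
frame_b = [sim.randint(0, 12) for _ in range(64)]
harvested = pack_bits(noise_bits(frame_a, frame_b))
```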

  3 Comments

Hi Jan,
Thank you for your effort in researching & posting basic information on my query, which I'm sure may be helpful to others here. Regarding the first two links you provided ("useful instructions"), I'm sorry but I'm not interested in using cygwin; & Prof. Brown's PDF does not refer to Visual Studio, nor indeed to the build process at all (as it's documented elsewhere on his web page).
I'm familiar with random.org & the general issue of physical-vs-pseudo random numbers, but your point about measurement of entropy is well-taken, thanks.
Regretfully, I can't accept your Answer. I'm still holding out for VS support, as originally requested ;)
Jan
on 14 Jan 2013
As you found out already, migrating the DIEHARDER suite to MSVC is not trivial. Installing cygwin or even Linux would be easier, and it has been tested already. The same applies to DIEHARD and TestU01. So of course this answer cannot be accepted, but perhaps it motivates you to keep alternatives in mind.
After lengthy & convincing discussion with Jan (see comments below), I realized that he's right: porting the dieharder project to MSVC would be a very poor use of time, especially given the limited & occasional dieharder use I imagine for myself.
Jan's Answer is to learn the basics of Ubuntu Linux & then run the dieharder binary directly.
I will create test data in Windows & copy to a USB flash drive for access under Ubuntu.
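For anyone following the same route, the export step might look roughly like the sketch below. It is written in Python, not MATLAB, and uses Python's own Mersenne Twister as a stand-in generator; the same text layout could be written from MATLAB with `fprintf`. The header layout (`type: d`, `count:`, `numbit:`) is my recollection of dieharder's ASCII input format, so please check it against the dieharder man page before relying on it. The function name `write_dieharder_ascii` and the output file name are made up for illustration.

```python
import random
import tempfile
from pathlib import Path
from typing import Iterable


def write_dieharder_ascii(path: Path, values: Iterable[int], label: str) -> int:
    """Write unsigned 32-bit integers as a dieharder-style ASCII file (assumed layout)."""
    data = list(values)
    for v in data:
        if not 0 <= v < 2**32:
            raise ValueError(f"value out of 32-bit range: {v}")
    # newline="\n" keeps Unix line endings even when the file is written on Windows.
    with open(path, "w", newline="\n") as fh:
        fh.write("#" + "=" * 66 + "\n")
        fh.write(f"# generator {label}\n")
        fh.write("#" + "=" * 66 + "\n")
        fh.write("type: d\n")
        fh.write(f"count: {len(data)}\n")
        fh.write("numbit: 32\n")
        for v in data:
            fh.write(f"{v:10d}\n")
    return len(data)


# Stand-in generator: Python's Mersenne Twister with a fixed seed.
rng = random.Random(12345)
values = [rng.getrandbits(32) for _ in range(1000)]
out_path = Path(tempfile.gettempdir()) / "prng_sample.txt"
written = write_dieharder_ascii(out_path, values, "python-mt19937 seed=12345")

# On the Linux side, the file would then be tested with something like
#   dieharder -a -g 202 -f prng_sample.txt
# (202 being the ASCII file-input generator, as far as I recall).
```

A real run needs far more than 1000 values: as far as I know, dieharder rewinds a file that is too short for a test, which undermines the results, so the sample size here only illustrates the format.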



Answer by Peter Perkins
on 14 Jan 2013

Brad, if your goal is to run Dieharder, I can't help.
If your goal is to verify that the generators in MATLAB pass stringent tests of randomness, then you'll find that L'Ecuyer and Simard published a paper a few years back that includes results for their TestU01 suite on a wide variety of generators, including mt19937ar, mrg32k3a, and mlfg6331, the recommended current generators in MATLAB.
Hope this helps.

  6 Comments

Jan
on 15 Jan 2013
Thanks for your answer, Brad. I can reconsider the arguments.
You can try this: boot the machine from a Linux live CD, e.g. Ubuntu. Ask a student how to install the DIEHARDER package. As far as I can see, you get pre-compiled binaries directly; otherwise compiling it is straightforward. Run DIEHARDER. I estimate this will cost you 10 minutes (again: 20 in real life).
Migrating DIEHARDER successfully and reliably to MSVC will cost at least one week, which means two weeks in real life. The compilers have a lot of tiny but evil differences, in contrast to MATLAB running on different platforms. Therefore I do not think that it is only a problem of creating an MSVC project from a makefile: the code itself will require modifications, followed by exhaustive testing and validation. Because any software above a certain size contains bugs, any change might reveal some. It is then questionable whether fixing a bug causes other problems far away from the affected lines of code. Investigating this exhaustively is extremely time-consuming, and this is, in my opinion, the reason why there is no description on the net of how to compile DIEHARDER in MSVC. Linux booted from a live CD or installed in a virtual machine is much cheaper (with respect to your time commitment) and more reliable.
On the other hand, I do not want to encourage you to work yourself to death. Perhaps your decision to port it to MSVC has a much deeper purpose: because it takes weeks, not in spite of that.
Hi Jan,
Thanks for your considered estimates & very constructive recommendations. Based on your specific step-by-step instructions for doing this on Linux, I will now accept your answer as a "better alternative" to what I asked for, because the logic of what you're saying is finally so completely obvious, even I can see it ;)
Thanks also for your concern about my state of mind & work-life balance. Please don't worry, things are going pretty well for me. Even though I do work 7 days per week, it's generally on my own schedule. I take frequent breaks & spend a good amount of time enjoying life. I do greatly enjoy coding & analysis though & feel very fortunate that I've found a great niche for myself where I can be productive & happy.
As far as this particular mini-project, I'm finally "connecting the dots" & understanding how easy this will be for me : simply find s/o who's comfortable & competent in Linux & just provide him/her the instructions & the data to test. Wow, brilliant! Thanks, Jan :)
I'll report back with results in the next several weeks, as it all comes together..
Brad
Jan,
I forgot to ask: would you mind please editing your original Answer or submitting a new one, with your rationale that a better idea is to run the dieharder distribution as-is under Ubuntu..? This way, readers can see straightaway what the discussion conclusion was. Alternatively, or in addition, do you think I should add an "Update" section to my Question, briefly summarizing our discussion?
Thanks, Brad



Answer by Jan Pospisil on 25 Feb 2013

One of my students tested the generators in his bachelor thesis. He used the generators in MATLAB as well as /dev/urandom on Linux systems (the kernel's entropy-seeded generator, used as the "true random" reference), then exported the numbers for dieharder and ran all the dieharder tests. If you are interested, I can send you the PDF.
