Okay, but are you declining to explain the difference between standard ODBC / JDBC / .Net Provider interfaces and the one you used?
I don't need to regurgitate already available documentation; please use Google for more insight. Bulk loading of data differs from one DBMS to another, so it's best to read how each one implements it under the covers. But bulk data loading is the closest thing to what you are doing with the DDS and RPG concept.
I believe that remains to be seen. I for one don't fully understand the API you used nor how it might apply to Big Data. Say you have devices that are recording measurements and feeding unstructured data to a computer and you want to store it in a database for future analysis. Would the API you referenced be the right tool for the job?
Again, read up so you can understand the API. I merely replicated your example of "I can insert data really fast using DDS and RPG", which you yourself presented as somehow relevant to "big data". So now we can both say we can insert data really fast. Except I can say that I can do it regardless of the database I'm using, and can hand that code off to the hordes of .NET programmers available in the world for maintenance and upkeep for future generations.
Does loadTest.Flush() queue work that runs asynchronously in a separate thread, then returns (like submitting a batch job)?
Nope, it's synchronous. I'm pushing 1 million records at a time in a loop (I thought that was evident from the for statement; I probably could have gone from 1 to 130 instead of from 1 million to 130 million in steps of 1 million). The full source code is available; you can compile and step through the code at your leisure if you want to learn more.
What would happen if you ran an SQL Select count(*) from LoadTest immediately following your "for" loop? Would it return 130,000,000?
Yep, it returns 130 million (and chews up a lot of disk space!).
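To illustrate why a synchronous flush makes that count come out complete: the sketch below is not the benchmark source and does not use the .NET bulk-load API. It is a minimal Java stand-in, and every name in it (InMemoryTable, BufferedLoader, BulkLoadSketch) is hypothetical. The "table" only counts rows, and the numbers are scaled down to 130,000 rows in batches of 1,000. The point is only that each flush() returns after its batch is stored, so nothing is still pending when the loop ends.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database table; it only counts the rows it receives.
class InMemoryTable {
    private long rowCount = 0;

    // Stores a whole batch and returns only once it has been counted.
    void writeBatch(List<int[]> rows) {
        rowCount += rows.size();
    }

    // Plays the role of "select count(*)".
    long count() {
        return rowCount;
    }
}

// Hypothetical buffered loader whose flush() is synchronous.
class BufferedLoader {
    private final InMemoryTable table;
    private final List<int[]> buffer = new ArrayList<>();

    BufferedLoader(InMemoryTable table) {
        this.table = table;
    }

    // Queues one row in memory; nothing reaches the table yet.
    void add(int[] row) {
        buffer.add(row);
    }

    // Writes the buffered rows on the calling thread, then empties the buffer.
    void flush() {
        table.writeBatch(buffer);
        buffer.clear();
    }
}

class BulkLoadSketch {
    // Loads totalRows rows, flushing every batchSize rows, and returns the row count
    // read immediately after the loop.
    static long load(long totalRows, int batchSize) {
        InMemoryTable table = new InMemoryTable();
        BufferedLoader loader = new BufferedLoader(table);
        for (long i = 1; i <= totalRows; i++) {
            loader.add(new int[] { (int) (i % 1000) });
            if (i % batchSize == 0) {
                loader.flush();
            }
        }
        loader.flush(); // write any partial final batch
        return table.count();
    }

    public static void main(String[] args) {
        System.out.println(load(130_000, 1_000));
    }
}
```

Had flush() only queued the batch for a background thread, a count taken right after the loop could come up short until that work finished.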
-----Original Message-----
From: Nathan Andelin [mailto:nandelin@xxxxxxxxx]
Sent: Friday, September 06, 2013 2:55 PM
To: Midrange Systems Technical Discussion
Subject: Re: How to handle Big Data under IBM i?
From: Matt Olson
Use the right API and you'll be golden.
Okay, but are you declining to explain the difference between standard ODBC / JDBC / .Net Provider interfaces and the one you used?
Simple answer here is you're using the wrong tool for the job.
I believe that remains to be seen. I for one don't fully understand the API you used nor how it might apply to Big Data. Say you have devices that are recording measurements and feeding unstructured data to a computer and you want to store it in a database for future analysis. Would the API you referenced be the right tool for the job?
I sure bet you were surprised when I could show how you can get the
same type of performance in a more versatile way (cross database
capable) and that this functionality is not something inherently
unique to IBM i.
Surprised? Well, puzzled would be more accurate. I began this journey by reviewing the documentation and studying the architecture of memSQL and VoltDB, so I'm not "surprised" by high performance. But in this case, I don't fully understand the interface used in your benchmark.
Consider the following clip of your code:
    for (int i = 1000000; i <= 130000000; i += 1000000)
    {
        loadTest.Flush();
    }
Does loadTest.Flush() queue work that runs asynchronously in a separate thread, then returns (like submitting a batch job)?
What would happen if you ran an SQL Select count(*) from LoadTest immediately following your "for" loop? Would it return 130,000,000?
-Nathan.