This is a tool to do data extraction. The term is well known as ETL, Extraction, Transform and Load.
I would say that is very easy to use; its proprietary script language is very basic and since it is mostly 5GL a lot of “coding” is just assembling (drag&drop and connecting) some predefined objects.
I also had some experience with SAS ETL tool and I have to say that both of them are very similar. Sadly my interaction with SAS tool was very limited and I cannot do a fully comparison against BO tool.
Although I am an “open source” guy I still have to do my job with whatever tools my employer has. So when it was decided that we were going to use BODI, I accepted the challenge and started reading some documentation and doing some prototypes. Everything seemed to be fairly easy…
After some weeks some questions started to come… how do we implement this complex logic? how do we deal with transactions? Where do we use scripting against 5GL coding?
So we finally found some caveats using this tool:
- No complex logic! If your process has to do validations then do all (or most part) of them before trying to do any extraction.
- Transactions! I understand the whole flow (job in terms of ETL) should be done in a single transaction, but if you do some scripting then forget about it; every SQL statement that you send thru it will be committed and not roll-backed if something goes wrong (you have a Catch object though).
- Predefined objects! Being a 5GL tool it has some objects that will do some tasks for you and this looks great until… you get to know that in most of the cases you have to give the big work load to DB. Yes I know that’s why DB exist but I expected that components of this tool do that for me (in an efficient way). I mean if I have two or more tables with millions of records and I want to do a join I expected that an optimized query were built and executed by the tool (based on the table objects and columns to do joins) ; instead as result I got this tool trying to load millions and millions of records doing a join in memory… “all that glitters is not gold”.
In defense of BODI I have to say that probably I broke all the established best practices but in my favor I have to say that there is an incredible lack of documentation; so if you are looking for some help you can read/post in this forum.
The advantages:
- Rapid development. If you need to do a bulk of data thru different types of relational DB’s (or file to file, relational to file and vice-versa) then this is a good solution.
- Batch processes. If you have to do data extraction continuously; it has a simple but good interface that will allow you to setup your job to run N times per day (hour, week, etc).
Doing some googling I found a site with open source ETL projects; I have not tried them yet but maybe I will give it a try to some of them at home just for fun…
Ok – Even if it is a very basic info I hope this help you to know the to-do/not-to-do with BODI. Also you will find the forum very helpful, it is frustrating to do research for hours (days) without finding anything; so the forum is an oasis in the middle of the desert.
Cheers,
Buho


Recent Comments