Wednesday, August 5, 2009

August 2009 Baton Rouge SQL Server User Group – Near Real-time Data Warehousing 1.0

Fresh off of my presentation of the SQL Profiler at SQL Saturday #17 in Baton Rouge this past Saturday, I am now preparing to speak at our next user group. The topic is Near Real-time Data Warehousing. I gave this talk at the SQL Saturday in Pensacola it was well received. However, before I could really get started presenting in Pensacola one of the participants asked, “Why do you need a Near Real-time Data Warehouse”. I was prepared to answer that question, but instead of using the rehearsed version of the answer I decided to traverse down another path. My answer went like this:

Have you ever actually tried to find a “Needle in a hay stack”? Of course trying to find that needle would by very difficult and time consuming and everyone agreed. I went back to my slide deck and revealed a query that consisted of THIRTY-FIVE joins. The results of that query was simply a detailed customer listing, which included columns like first name, last name, address, phone number, city, state, etc.. I continued by explaining why the query was so long and complicated. Our developers designed somewhat of an Object-Oriented database. As a result, we have a database that is at a level of normalization that does not exist (smile). Therefore, our team was posed with the task of de-normalizing the database to make it more report friendly.

The project was initially tasked as flattening the operational tables into a less normalized state, which was to include lots and lots of T-SQL and SSIS packages. However, as the project progressed someone suggested that we just make it a warehouse and try to populate it at certain intervals though out the day. I had built some very small warehouses, but nothing of this magnitude and visibility. As a result, I accepted the task. The result of the project is six FACT table and 23 dimensions that are built in intervals of 15 minutes using the following technologies:
1. Transactional Replication
2. Triggers
3. SQL CLR
4. Message Queuing
5. SSIS
6. A Custom ETL application.

If you want to find out more and you can’t make it to our local SQL Server Group (go to http://batonrouge.sqlpass.org to get information about our local group) send me an email at pleblanc@tsqlscripts.com to receive and invitation to a LIVE MEETING broadcast of our use group. If you can’t make either don’t worry, I am writing an article on the entire process that hopefully will be published somewhere soon. Please note that this is version 1.0, version 2.0 does not rely on items 2,4, and 6 of the aforementioned list. Instead we have incorporated Change Data Capture, which makes life so much easier.

Talk soon Patrick LeBlanc,
SQL Down South.

No comments:

Post a Comment