Saturday, 21 December 2013

What is Biml?

Business Intelligence Markup Language (Biml) is a domain specific XML dialect used to specify end-to-end business intelligence (BI) and data warehouse (DW) solutions.  Currently Biml supports the representation of relational models, data transformation packages, multi-dimensional models (OLAP), permissions, databases, and flat files.
Why Biml?
The Microsoft SQL Server Business Intelligence stack offers an excellent platform for the development of BI / DW solutions.  SQL Server is a robust, reliable and enterprise-grade relational database; and SSIS and SSAS have robust processing engines that have proven themselves in some of the toughest situations.  In addition to the robust processing engines, Microsoft has targeted their tools to be comfortable to the IT generalist working in heterogeneous environments.  This is accomplished by making the tools highly graphical so that they will appeal to a broad range of needs and skill sets. 
The Microsoft BI Stack can be adopted in whole or in part.  For instance, SSAS can work with an Oracle database, and SSIS can be installed and used without a SQL Server database. The strengths of the Microsoft BI Stack were achieved with the following disadvantages: 
  1. The Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) emphasis on graphical tools has implications for file format. The dtsx file format in SSIS 2005 and 2008 was not designed to be readable or writable by humans.  While SSIS 2012’s file format is human-readable it is still not designed to be human-writable. The dtsx file tracks a large amount of non-semantic data such as design surface coordinates, GUIDs, and other designer-specific markup. Change tracking and merging is not practical.  It is easier in SSIS 2012, but small changes in a package can still result in many changes to the file that have little or nothing to do with semantic updates.
  2. The Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) emphasis on graphical tools has implications for usability. Details require context switching between Control Flow and Data Flow. The tools have limited search, find/replace, and compare tools. Non-semantic changes result in file changes. For example, changing the position of a task on the Control Flow results in a change just like rewriting the contents of a Script Task. Changes to code are mixed with changes to non-code.
  3. Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) architectural limitations and bugs. An incrementally edited graphical design surface can result in lineage tracking that is easily corrupted. Package edits can only be done one at a time and edits cannot be applied to a set of packages.  This makes refactoring many packages costly, time-consuming and error prone. BIDS and SSDT limit the use of implementing many aspects of agile development methodologies such as continuous integration, code reviews, incremental development, and code refactoring.
  4. The emphasis on piecemeal adoption has implications for development model. There are different projects and user interfaces for different levels of the Business Intelligence (BI) stack. Shared aspects of the solution must be re-specified or imported at every level in the BI stack. Changes must be synchronized across all levels of the BI stack.
  5. The focus on the generalist has implications for abstraction model. Inherently, there are very limited capabilities for abstraction and reuse. There are no inherent capabilities for formalizing patterns and practices.
 

What Can You Do With Biml?
Suppose your manager tasks you with building a data warehouse? The source data resides in 100 tables on the system of record, and each table is designed differently. Based on an initial analysis, you realize an incremental load pattern will best meet the requirements. Manually coding 100 SSIS packages using the Incremental Load design pattern will take time. How much time? I usually estimate two hours per SSIS package (which includes development and unit testing). If that holds for your example, that’s 200 hours of SSIS development.
Biml allows you to create a template that will generate an Incremental Load SSIS package. In this Stairway Series, we will do just that in Building an Incremental Load Package – Level 3. Using Biml to generate a template for a single SSIS package is cool but it doesn’t really solve your problem. The fourth installment in this series helps. Using Biml as an SSIS Design Patterns Engine – Level 4 takes extends the functionality of the Incremental Load Biml file built in Level 3. The result is a Biml file that will build an incremental load SSIS package for each table in a source database. In your use case, that would mean the Biml file would construct 100 SSIS packages in a matter of minutes.
But there’s more. Because the SSIS packages are generated from the same template, the quality of the code increases while the development time decreases. Increased quality further translates into reduced testing requirements. In his now-classic 1986 article, No Silver Bullet—Essence and Accidents of Software Engineering, Frederick Brooks postulates "there is no single development, in either technology or management technique, which by itself promises even one order of magnitude [tenfold] improvement within a decade in productivity, in reliability, in simplicity." Mr. Brooks may be right, but Biml comes pretty close to hitting (and exceeding) that mark when it comes to certain data integration use cases for SSIS.

No comments:

Post a Comment