Business
Intelligence Markup Language (Biml) is a domain specific XML dialect used to
specify end-to-end business intelligence (BI) and data warehouse (DW)
solutions. Currently Biml supports the representation of relational
models, data transformation packages, multi-dimensional models (OLAP),
permissions, databases, and flat files.
Why Biml?
The
Microsoft SQL Server Business Intelligence stack offers an excellent platform
for the development of BI / DW solutions. SQL Server is a robust,
reliable and enterprise-grade relational database; and SSIS and SSAS have
robust processing engines that have proven themselves in some of the toughest
situations. In addition to the robust processing engines, Microsoft has
targeted their tools to be comfortable to the IT generalist working in
heterogeneous environments. This is accomplished by making the tools
highly graphical so that they will appeal to a broad range of needs and skill
sets.
The
Microsoft BI Stack can be adopted in whole or in part. For instance, SSAS
can work with an Oracle database, and SSIS can be installed and used without a
SQL Server database. The strengths of the Microsoft BI Stack were achieved with
the following disadvantages:
- The Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) emphasis on graphical tools has implications for file format. The dtsx file format in SSIS 2005 and 2008 was not designed to be readable or writable by humans. While SSIS 2012’s file format is human-readable it is still not designed to be human-writable. The dtsx file tracks a large amount of non-semantic data such as design surface coordinates, GUIDs, and other designer-specific markup. Change tracking and merging is not practical. It is easier in SSIS 2012, but small changes in a package can still result in many changes to the file that have little or nothing to do with semantic updates.
- The Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) emphasis on graphical tools has implications for usability. Details require context switching between Control Flow and Data Flow. The tools have limited search, find/replace, and compare tools. Non-semantic changes result in file changes. For example, changing the position of a task on the Control Flow results in a change just like rewriting the contents of a Script Task. Changes to code are mixed with changes to non-code.
- Business Intelligence Development Studio (BIDS) and SQL Server Data Tools (SSDT) architectural limitations and bugs. An incrementally edited graphical design surface can result in lineage tracking that is easily corrupted. Package edits can only be done one at a time and edits cannot be applied to a set of packages. This makes refactoring many packages costly, time-consuming and error prone. BIDS and SSDT limit the use of implementing many aspects of agile development methodologies such as continuous integration, code reviews, incremental development, and code refactoring.
- The emphasis on piecemeal adoption has implications for development model. There are different projects and user interfaces for different levels of the Business Intelligence (BI) stack. Shared aspects of the solution must be re-specified or imported at every level in the BI stack. Changes must be synchronized across all levels of the BI stack.
- The focus on the generalist has implications for abstraction model. Inherently, there are very limited capabilities for abstraction and reuse. There are no inherent capabilities for formalizing patterns and practices.
What Can You Do
With Biml?
Suppose
your manager tasks you with building a data warehouse? The source data resides
in 100 tables on the system of record, and each table is designed differently.
Based on an initial analysis, you realize an incremental load pattern will best
meet the requirements. Manually coding 100 SSIS packages using the Incremental
Load design pattern will take time. How much time? I usually estimate two hours
per SSIS package (which includes development and unit testing). If that holds
for your example, that’s 200 hours of SSIS development.
Biml
allows you to create a template that will generate an Incremental Load SSIS
package. In this Stairway Series, we will do just that in Building an
Incremental Load Package – Level 3. Using Biml to generate a template for a
single SSIS package is cool but it doesn’t really solve your problem. The
fourth installment in this series helps. Using Biml as an SSIS Design Patterns
Engine – Level 4 takes extends the functionality of the Incremental Load Biml
file built in Level 3. The result is a Biml file that will build an incremental
load SSIS package for each table in a source database. In your use case,
that would mean the Biml file would construct 100 SSIS packages in a matter of
minutes.
But
there’s more. Because the SSIS packages are generated from the same template,
the quality of the code increases while the development time decreases.
Increased quality further translates into reduced testing requirements. In his
now-classic 1986 article, No
Silver Bullet—Essence and Accidents of Software Engineering, Frederick
Brooks postulates "there is no single development, in either
technology or management technique, which by itself promises even one order of
magnitude [tenfold] improvement within a decade in productivity, in
reliability, in simplicity." Mr. Brooks may be right, but Biml comes
pretty close to hitting (and exceeding) that mark when it comes to certain data
integration use cases for SSIS.