SQL, MDX, DAX – the languages of data

Ramblings of a retired data architect

Let me start by saying that I have been working with data for over thirty years. I think that just means I am old. Anyway, I have written blog posts, delivered presentations, and authored books on these languages through the years. Understanding and using these languages have grown and shaped my career through the years. I thought it would be fun to discuss my thoughts on each one. This is my take and some of my thoughts are definitely “tongue in cheek.” So, enjoy the ride and feel free to share your take in the comments.

  • SQL – ubiquitous and relational
  • MDX – complex and dimensional
  • DAX – formulaic and columnar
  • Thoughts and musings

SQL, structured query language

SQL is the oldest of the languages. It was designed to support relational databases (RDBMS). It is built on math principles to improve performance and optimize storage. Normalization rules were established to guide developers on the preferred approaches to building databases.

Why is SQL ubiquitous?

SQL is everywhere. SQL is the query language of choice for enterprise data platforms such as Microsoft SQL Server and Oracle. Open-source data platforms like MySQL and PostgreSQL are also built to use SQL.

How is this possible? SQL is an ANSI standard. This means that the core of the language is managed by a governing body. If you learn SQL, you should be able to write queries in all these databases, right? Sort of. You should be able to write a query like this in all the databases: SELECT field1, field2 FROM table WHERE field3 = 50.

However, vendors often implement their own variations of SQL to meet needs in their platform design or to provide nonstandard functionality to their users. (Engineering outpaces standards development.) For example, Microsoft created TSQL and Oracle created PL/SQL. One of my first experiences with this was returning a single row in query. I used TOP 1 in SQL Server, but there was no TOP keyword in Oracle.

Code examples

SQL Server

SELECT TOP 1 column1, column2 
FROM table_name
WHERE condition;

Oracle

SELECT column1, column2
FROM table_name
WHERE condition
AND ROWNUM = 1;

PostgreSQL / MySQL

SELECT column1, column2
FROM table_name
WHERE condition
LIMIT 1;

What does it all mean?

You can learn SQL and be efficient querying data from multiple data platforms. Whether you are a data engineer or a data analyst, you must know SQL if you are to be taken seriously as a data professional.

MDX, multidimensional expressions

I was introduced to MDX by SQL Server Analysis Services (SSAS). For me, it just clicked. More about that in a bit. MDX, like SQL, is heavily based on math. Whereas SQL is two dimensional (column and rows), MDX can theoretically use an unlimited number of dimensions. The number is limited in practice by the capability of the data platform. MDX was primarily used by two vendors, Microsoft and Hyperion.

One key difference between the two platforms is their purpose. Relational databases are optimized for transactions and small result sets. Multidimensional databases are built for analysis across huge datasets.

Why is MDX considered complex?

The toughest part for most data professionals is visualizing multidimensional datasets in their minds. Relational data is easy to visualize. It looks like a spreadsheet. Multidimensional data is not that simple. We call it a cube, but that is a simplistic representation with only three dimensions. It is a cool name though.

Earlier in my career I coached data consultants on their transition to BI consultants. As I helped a consultant with MDX, I told him at some point he would “get it.” I told him to call me when he did. Six months later he called me, told me that he got it, and hung up on me. Many consultants didn’t get it and either just forced their way through it or went back to relational.

MDX was designed to traverse dimensions, build sets, and aggregate values across those sets. I mentioned earlier that MDX made sense to me right away. The first time I was exposed to MDX, I learned about the various functions and methods to work with dimensions including child, parent, descendants, and ancestors. You could think of dimensions like family trees. I took a class in college about familial relationships which used similar concepts. My degree is in cultural anthropology.

  • Another difficult concept to master is context. You must understand the set or slice of data you are working with in a query.
  • Once you understand context, you need to realize that every measure is an aggregate of every dimension whether a part of the query or not.
  • Results can be shaped in multiple dimensions. Three or more dimensions cannot be visualized. If you want to visualize the data in a report, it needs to be formed into columns and rows.

Multidimensional databases and MDX are extremely powerful but complex. I enjoyed working with them and became one of the few experts in the technology. However, multidimensional databases and MDX are rarely used today. Microsoft is not advancing the technology, instead promoting columnar data structures.

DAX, data analysis expressions

My first experience with DAX was when PowerPivot was released with Excel. It was then that I saw the writing on the wall for MDX. DAX is simpler and more approachable than MDX. Microsoft then added tabular models built on the same data engine, Vertipaq. Eventually culminating in the Power BI model. The underlying data engine is a highly optimized columnar data structure.

Admittedly, I have the least amount of hands-on experience with DAX. However, I disliked it early on. Unlike SQL and MDX, DAX is not built around math principles and is not a query language. It is built with expressions. Instead of SELECT, it starts with an equal sign (=). This is more intuitive for Excel users. Early on, it was very frustrating for me.

DAX is continually being improved. Microsoft is also continuing to improve the underlying data engine and storage subsystem. Power BI models are one of the foundational building blocks of Microsoft Fabric.

Should you learn DAX?

If your business uses Power BI, then yes. DAX is used to aggregate, shape, and format data for usage by end users. It is not necessary for data engineers who don’t present data to end users.

Thoughts and musings

My first recommendation is to learn SQL if you want to be taken seriously as a data professional. It has been around since the beginning and will be around for a while to come. Relational data platforms are integrating columnar data storage technology which gives SQL users access to the performance available in Power BI models.

While I contend that MDX is more powerful, I concur that DAX is more approachable. As MDX goes the way of COBOL, SQL remains the powerhouse. Learn DAX if you intend to use Power BI, otherwise don’t bother.

That is my 2 cents. Have a different opinion? Sound off in the comments below.

T-SQL Window Functions – Part 4: Analytic Functions

This is a reprint with some revisions of a series I originally published on LessThanDot. You can find the links to the original blogs on my Series page.

TSQL-WIndow-Functions_thumb1_thumb_tIn the final installment of my series on SQL window functions, we will explore using analytic functions. Analytic functions were introduced in SQL Server 2012 with the expansion of the OVER clause capabilities. Analytic functions fall in to two primary categories: values at a position and percentiles. Four of the functions, LAG, LEAD, FIRST_VALUE and LAST_VALUE find a row in the partition and returns the desired value from that row. CUME_DIST and PERCENT_RANK break the partition into percentiles and return a rank value for each row. PERCENTILE_CONT and PERCENTILE_DISC a value at the requested percentile in the function for each row. All of the functions and examples in this blog will only work with SQL Server 2012.
Once again, the following CTE will be used as the query in all examples throughout the post:

with CTEOrders as
(select cast(1 as int) as OrderID, cast(‘3/1/2012’ as date) as OrderDate, cast(10.00 as money) as OrderAmt, ‘Joe’ as CustomerName
union select 2, ‘3/1/2012’, 11.00, ‘Sam’
union select 3, ‘3/2/2012’, 10.00, ‘Beth’
union select 4, ‘3/2/2012’, 15.00, ‘Joe’
union select 5, ‘3/2/2012’, 17.00, ‘Sam’
union select 6, ‘3/3/2012’, 12.00, ‘Joe’
union select 7, ‘3/4/2012’, 10.00, ‘Beth’
union select 8, ‘3/4/2012’, 18.00, ‘Sam’
union select 9, ‘3/4/2012’, 12.00, ‘Joe’
union select 10, ‘3/4/2012’, 11.00, ‘Beth’
union select 11, ‘3/5/2012’, 14.00, ‘Sam’
union select 12, ‘3/6/2012’, 17.00, ‘Beth’
union select 13, ‘3/6/2012’, 19.00, ‘Joe’
union select 14, ‘3/7/2012’, 13.00, ‘Beth’
union select 15, ‘3/7/2012’, 16.00, ‘Sam’
)
select OrderID
,OrderDate
,OrderAmt
,CustomerName
from CTEOrders;

Position Value Functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE

Who has not needed to use values from other rows in the current row for a report or other query? A prime example is needing to know what the last order value was to calculate growth or just show the difference in the results. This has never been easy in SQL Server until now. All of these functions require the use of the OVER clause and the ORDER BY clause. They all use the current row within the partition to find the row at the desired position.

The LAG and LEAD functions allow you to specify the offset or how many rows to look forward or backward and they support a default value in cases where the value returned would be null. These functions do not support the use of ROWS or RANGE in the OVER clause. The FIRST_VALUE and LAST_VALUE allow you to further define the partition using ROWS or RANGE if desired.

The following example illustrates all of the functions with various variations on the parameters and settings.

select OrderID
,OrderDate
,OrderAmt
,CustomerName
,LAG(OrderAmt) OVER (PARTITION BY CustomerName ORDER BY OrderID) as PrevOrdAmt
,LEAD(OrderAmt, 2) OVER (PARTITION BY CustomerName ORDER BY OrderID) as NextTwoOrdAmt
,LEAD(OrderDate, 2, ‘9999-12-31’) OVER (PARTITION BY CustomerName ORDER BY OrderID) as NextTwoOrdDtNoNull
,FIRST_VALUE(OrderDate) OVER (ORDER BY OrderID) as FirstOrdDt
,LAST_VALUE(CustomerName) OVER (PARTITION BY OrderDate ORDER BY OrderID) as LastCustToOrdByDay

from CTEOrders

Results (with shortened column names):
ID OrderDate Amt Cust PrevOrdAmt NextTwoAmt NextTwoDt FirstOrd LastCust
1 3/1/2012 10 Joe NULL 12 3/3/2012 3/1/2012 Joe
2 3/1/2012 11 Sam NULL 18 3/4/2012 3/1/2012 Sam
3 3/2/2012 10 Beth NULL 11 3/4/2012 3/1/2012 Beth
4 3/2/2012 15 Joe 10 12 3/4/2012 3/1/2012 Joe
5 3/2/2012 17 Sam 11 14 3/5/2012 3/1/2012 Sam
6 3/3/2012 12 Joe 15 19 3/6/2012 3/1/2012 Joe
7 3/4/2012 10 Beth 10 17 3/6/2012 3/1/2012 Beth
8 3/4/2012 18 Sam 17 16 3/7/2012 3/1/2012 Sam
9 3/4/2012 12 Joe 12 NULL 12/31/9999 3/1/2012 Joe
10 3/4/2012 11 Beth 10 13 3/7/2012 3/1/2012 Beth
11 3/5/2012 14 Sam 18 NULL 12/31/9999 3/1/2012 Sam
12 3/6/2012 17 Beth 11 NULL 12/31/9999 3/1/2012 Beth
13 3/6/2012 19 Joe 12 NULL 12/31/9999 3/1/2012 Joe
14 3/7/2012 13 Beth 17 NULL 12/31/9999 3/1/2012 Beth
15 3/7/2012 16 Sam 14 NULL 12/31/9999 3/1/2012 Sam

If you really like subselects, you can also mix in some subselects and have a very creative SQL statement. The following statement uses LAG and a subselect to find the first value in a partition. I am showing this to illustrate some more of the capabilities of the function in case you have a solution that requires this level of complexity.

select OrderID
,OrderDate
,OrderAmt
,CustomerName
,LAG(OrderAmt, (
select count(*)-1
from CTEOrders c
where c.CustomerName = CTEOrders.CustomerName
and c.OrderID <= CTEOrders.OrderID), 0)
OVER (PARTITION BY CustomerName ORDER BY OrderDate, OrderID) as FirstOrderAmt
FROM CTEOrders

Percentile Functions: CUME_DIST, PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC

As I wrap up my discussion on window functions, the percentile based functions were the functions I knew the least about. While I have already used the position value functions above, I have not yet needed to use the percentiles. So, that meant I had to work with them for a while so I could share their usage and have some samples for you to use.

The key differences in the four function have to do with ranks and values. CUME_DIST and PERCENT_RANK return a ranking value while PERCENTILE_CONT and PERCENTILE_DISC return data values.

CUME_DIST returns a value that is greater than zero and lest than or equal to one (>0 and <=1) and represents the percentage group that the value falls into based on the order specified. PERCENT_RANK returns a value between zero and one as well (>= 0 and <=1). However, in PERCENT_RANK the first group is always represented as 0 whereas in CUME_DIST it represents the percentage of the group. Both functions return the last percent group as 1. In both cases, as the ranking percentages move from lowest to highest, each group’s percent value includes all of the earlier values in the calculation as well.

The following statement shows both of the functions using the default partition to determine the rankings of order amounts within our dataset.

select OrderID
,OrderDate
,OrderAmt
,CustomerName
,CUME_DIST() OVER (ORDER BY OrderAmt) CumDist
,PERCENT_RANK() OVER (ORDER BY OrderAmt) PctRank
FROM CTEOrders

Results:
OrderID OrderDate OrderAmt CustomerName CumDist PctRank
1 3/1/2012 10 Joe 0.2 0
3 3/2/2012 10 Beth 0.2 0
7 3/4/2012 10 Beth 0.2 0
2 3/1/2012 11 Sam 0.33333333 0.214285714
10 3/4/2012 11 Beth 0.33333333 0.214285714
6 3/3/2012 12 Joe 0.46666667 0.357142857
9 3/4/2012 12 Joe 0.46666667 0.357142857
14 3/7/2012 13 Beth 0.53333333 0.5
11 3/5/2012 14 Sam 0.6 0.571428571
4 3/2/2012 15 Joe 0.66666667 0.642857143
15 3/7/2012 16 Sam 0.73333333 0.714285714
5 3/2/2012 17 Sam 0.86666667 0.785714286
12 3/6/2012 17 Beth 0.86666667 0.785714286
8 3/4/2012 18 Sam 0.93333333 0.928571429
13 3/6/2012 19 Joe 1 1

The last two functions, PERCENTILE_CONT and PERCENTILE_DISC, return the value at the percentile requested. PERCENTILE_CONT will return the true percentile value whether it exists in the data or not. For instance, if the percentile group has the values 10 and 20, it will return 15. If PERCENTILE_DISC, is applied to the same group it will return 10. It will return the smallest value in the percentile group, which in this case is 10. Both functions ignore NULL values and do not use the ORDER BY, ROWS, or RANGE clauses with the PARTITION BY clause. Instead, WITHIN GROUP is introduced which must contain a numeric data type and ORDER BY clause. Only one column can be specified here. Both functions need a percentile value which can be between 0.0 and 1.0.

The following script illustrates a couple of variations. The first two functions return the median of the default partition. Then next two return the median value for each day. Finally, the last two functions return the low and high values within the partition. The values segmented by the date partition highlight the key difference between the two functions.

select OrderID as ID
,OrderDate as ODt
,OrderAmt as OAmt
,CustomerName as CName
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY OrderAmt) OVER() PerCont05
,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY OrderAmt) OVER() PerDisc05
,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY OrderAmt) OVER(PARTITION BY OrderDate) PerContDt
,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY OrderAmt) OVER(PARTITION BY OrderDate) PerDiscDt
,PERCENTILE_CONT(0) WITHIN GROUP (ORDER BY OrderAmt) OVER() PerCont0
FROM CTEOrders

Results
ID ODt OAmt CName PerCont05 PerDisc05 PerContDt PerDiscDt PerCont0
1 3/1/2012 10 Joe 13 13.00 10.5 10.00 10
2 3/1/2012 11 Sam 13 13.00 10.5 10.00 10
3 3/2/2012 10 Beth 13 13.00 15.0 15.00 10
4 3/2/2012 15 Joe 13 13.00 15.0 15.00 10
5 3/2/2012 17 Sam 13 13.00 15.0 15.00 10
6 3/3/2012 12 Joe 13 13.00 12.0 12.00 10
7 3/4/2012 10 Beth 13 13.00 11.5 11.00 10
10 3/4/2012 11 Beth 13 13.00 11.5 11.00 10
9 3/4/2012 12 Joe 13 13.00 11.5 11.00 10
8 3/4/2012 18 Sam 13 13.00 11.5 11.00 10
11 3/5/2012 14 Sam 13 13.00 14.0 14.00 10
12 3/6/2012 17 Beth 13 13.00 18.0 17.00 10
13 3/6/2012 19 Joe 13 13.00 18.0 17.00 10
14 3/7/2012 13 Beth 13 13.00 14.5 13.00 10
15 3/7/2012 16 Sam 13 13.00 14.5 13.00 10

As I wrap up this post, I have to give a shout out to my daughter, Kristy, who is an honors math student. She helped me get my head around this last group of functions. Her honors math work and some statistical work she had done in science helped provide additional insight into the math behind the functions. (Kristy – you rock!)

Series Wrap Up

I hope this series helps everyone understand the power and flexibility in the window functions made available in SQL Server 2012. If you happen to use Oracle, I know that many of these functions or there equivalent are also available in 11g and they also appear to be in 10g. I have to admit my first real production usage was with Oracle 11g but has since used them with SQL Server 2012. The expanded functionality in SQL Server 2012 is just one more reason to upgrade to the latest version.

Oracle Tips for MSBI Devs #6: Supporting SSAS Tabular Development

As SQL Server Analysis Services Tabular Models become more popular, models will use Oracle databases as sources. One of the key issues whenever you work with Oracle is understanding how to properly configure the necessary components to enable development.

Getting Started

If you have worked with Oracle before, you are very aware of a few things you need to be successful. First, you need to install the Oracle client. Here is where the details get messy. When you are working with MSBI tools, you will be using SQL Server Data Tools in Visual Studio which is still only 32 bit. Of the BI tools in SSDT, only SSIS has run modes to support 32 bit and 64 bit configurations. As a result, you need to install the 32 bit Oracle client in order to develop your tabular model.

Once that has been installed you will need to update the TNSNAMES.ORA file with the servers you will be targeting during development. Ideally, your Oracle DBAs have a file for you to use so you don’t need to create one. One nice thing is that the Oracle 12c client updates the PATH environment variable with the location of the bin folder. (Yes, Oracle still uses environment variables.) I would also recommend adding or using the TNS_ADMIN variable to specify the location of the TNSNAMES.ORA file. (See http://www.orafaq.com/wiki/TNS_ADMIN for details.)

NOTE: It took me many hours to work through a variety of configuration issues related to working with the Oracle client install. A couple of reinstalls, reboots, TNSNames.ORA tweaks, and lots of fruitless searching were all required to get this working. Be warned, working with Oracle clients are neither fun nor simple.

The Issue

Now that you have the 32 bit client installed you can connect to the Oracle database through the tabular model designer. As shown below, you can connect to Oracle through the Table Import Wizard.

image

You will be able to successfully test the connection as noted here.

image

And you will be able to execute a query and get results. You can also use the option to select tables and views.

image

However, once you decide to import the data you will encounter the following error:

image

The issue is that while you can do most of your work within Visual Studio using the 32 bit client, the import process targets the SQL Server tabular instance you specified when you created the project. While the 32 bit version of SQL Server is still available, most of us would not install that, even in our development environments. If you do not encounter this error, you are either using the 32 bit client of SQL Server or you have the 64 bit Oracle client installed (more on that next). As long as Visual Studio is only 32 bit compliant and you choose to use the 64 version of SQL Server you will see this issue.

The Resolution

The resolution is fairly simple. You need to download and install the 64 bit Oracle client. I would recommend that you get it installed, then reboot your development PC. While this may not be required, it seems to have helped me with a number of connectivity issues. You will need to be prepared for some “interesting” issues as you will have more than one Oracle home installed and you have the potential of many types of ORA-XXXXX errors. Once you are up and running you should be able to develop tabular models built on Oracle databases.

Some Parting Thoughts

First, I want to be clear that I think that Oracle is a solid database platform. However, I have never been at a client site or on a project where the connectivity or client installs were totally correct or functional without some work between the Oracle team and the BI development team. I think that the .NET driver is supposed to better and I may try that out for a later post (when I have the hours to spare).

I did the testing for this completely on Azure (and my Surface). I set up an Oracle VM and a SQL Server VM on Azure. The Microsoft team put together a great reference on setting up your Oracle VM. Check it out. I also did a previous post on setting up Oracle in an Azure VM. Both VM types can be pricey, but in a testing environment all was not too bad. I encourage you to use Azure to for these types of scenarios. But be sure to turn it off when you are done.

Oracle Tips for MSBI Devs #5: Working with Oracle on Windows Azure

As you have likely noticed in my series, Oracle Tips for MSBI Devs, I have done a lot of work with Oracle through the years while delivering BI solutions. One of the pain points of working with Oracle in development is setting up an Oracle development server. Even though I have installed Oracle a number of times, it is never seems to be an easy process.

So, I decided to try out the Oracle Virtual Machine template in Windows Azure. I will walk through the setup process here. I need to use Oracle as a data source for some SSIS development.

Setting Up the VM

From the Windows Azure portal, select the Virtual Machines tab then “Create a Virtual Machine”. This will open up the variety of options available to create the VM. Select the FROM GALLERY option which will open another dialog.

image

On the next screen, you pick the edition Oracle you want to use for the VM. (NOTE: at the moment, Oracle images are in preview. Microsoft recently that Oracle VMs will be be available on March 12. You can find more information here.)

image

I will be using the Oracle 11g R2 Standard Edition on Windows Server 2008 R2. The next step is to name and pick the size of the VM. The minimum size for this is Small and is what I used. I then completed the setup including setting up the endpoints and creating a new user.

I had originally tried to use Oracle 12c, but significant changes have been made to support multitenancy which make set up considerably more tedious with very few good examples available on the web. Most of the advice given by Oracle pros was to “Read the _____ Manual!” While “sensible”, I just needed a simple dev environment. This is one of the significant advantages of working with SQL Server, community help is abundant and usually pleasant. For instance, Microsoft recently published a document for setting up the Oracle 12c VM. I used it to work through some of the setup instructions below.

Once the initialization was complete I used the connect image button to open an RDP connection to the VM from the Azure dashboard. One thing to keep in mind, be sure to keep track of the user name and password you created. This your admin account and you will need it to log in to the VM. Now you have a running VM. At this point, I went and found the Oracle tools that I typically use and pinned them to the task bar.

Creating and Loading an Oracle Schema

Because I always for get some of these steps, and I really don’t want to read the manual, I listed the steps I used to create a simple set of data for use. This is not good enough for production, but it is a starting point.

Step 1: Create a Listener. This is required before you can create a database. To do this open the Oracle Net Configuration Assistant. From here you can create your first listener. I left the default settings for the setup.

Step 2: Create the database. (This is the equivalent of an instance in SQL Server.)  I used the Database Configuration Assistant for Oracle Database to create my first database on the server. This can be found in the Oracle home directory on the start menu.I chose the General Purpose template for my database. Most of the steps make some sense. I did choose to add the sample schemas as this is the easiest way to verify I can connect and work with the data. After all of this, the database will be created based on your choices.

Step 3: Using SQL*Plus, I connected to the SYSTEM schema. The user-name in this case is “SYSTEM”. Now we

Step 4: Create a new user and schema. (This is similar to the SQL Server database, not a SQL Server Schema.) This will give a location to create tables for loading data in the next steps. In accordance with typical Oracle support you can read about how to do this here: http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_8003.htm#i2065278. Or I can give you a good starting script and save you time.

CREATE USER someusername
IDENTIFIED BY <<add password here>>
DEFAULT TABLESPACE example
QUOTA 10M ON example
TEMPORARY TABLESPACE temp
QUOTA 5M ON system;

Your welcome! This will create a database using existing tablespaces. This is NOT a production level script and it is barely good enough for development. But in my case, I am using Oracle as a source and don’t plan to do much development or work on it so it meets my needs. If you need more insight, I guess you will need to read the documentation.

Step 5: Create a table and add rows. I continued to use the SYSTEM login and created a couple of simple tables within my new schema. I then used simple INSERT INTO statements to add data.

Now you have some basic data to work with to test connectivity with SSIS or SSAS.

Making the Oracle Database Accessible

In order to access your Oracle database on the VM you need to enable the port. You do this by going to the Azure portal and selecting the VMs tab. Once there, go to the Endpoints tab. You may recall that when you created the VM, you were asked about the Remote Desktop and PowerShell ports. Here are the steps to create the Endpoint to support Oracle.

  1. Click Add to open the Add Endpoint dialog.
  2. On the first page, leave the default which will add a stand-alone endpoint.
  3. On the second page you need to add a name (I used “Oracle”), select the TCP protocol, and put port 1521 in both the private and public port textboxes.

Once completed you should see the new endpoint in the list of available endpoints as shown below.

image

Connecting SSIS to that Oracle Database

Now that we have data in the db, it is time to create the connection to SSIS and load data and run queries.

The first thing I needed to do was load the client tools. I used the newer Oracle Data Access Components (ODTwithODAC12012). Once that was loaded, I added the following entry to my TNSNames.ora file (look for that in a directly like the following: c:\app\<username>\product\12.1.0\client_1\Network\Admin):

ORACLEDW =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = <servername>.cloudapp.net) (PORT  1521)
(CONNECT_DATA =
(SERVER=DEDICATED)
(SERVICE_NAME = ORACLEDW)
)
)

The key parts to get right are the HOST, PORT, and SERVICE_NAME as highlighted above.

Once TNS was in place, I was able to create an ODP.NET connection to the database and proceed to load the data.

I know that some of this has been simplistic but it is great that I don’t have to install Oracle myself. This functionality makes Azure even more appealing as a hosting solution.

SQL Saturday#197–Omaha Recap

96791a10-4559-4bac-bb98-c25ebc5e52c6

This was the second SQL Saturday hosted in Omaha.  I loved to see how the event grew from the first event until now.  John Morehouse ( T | B ) and team did another stellar job organizing this event. I know they packed the house

Having taken part in the first event, I found it spectacular that the speaker list was so diverse.  It is great to see so many SQL Server pros come out and speak at these events.

Part of the fun for me was bringing my 11-year old daughter along.  Many of you, speakers, attendees, and sponsors were kind to her and she had a good time, even though much of it was spent using my Surface to watch Netflix.  Smile  As a speaker, this was a way to spend some time on the road with her and to introduce one of my children to what I do when I travel to these events (one of my sons will be joining me in Fargo).

I do have to say that the food, both at the speaker’s dinner and for lunch were awesome!  If you are looking for an event that will feed you well, be sure to try this event next time around.

I was able to attend a few of the sessions, but I wanted to mention that the SQL Server vs Oracle: The Throwdown! was really good.  As a cross-over platform developer (check out my Oracle for MSBI Tips), it was great having a SQL Server Pro, David Klee (@kleegeek), and an Oracle Pro, Joe Grant (@dba_jedi), co-present.  Nice work guys!

Finally, I presented on Building BI Solutions with Excel 2013.  I have uploaded the slides to the event site.  Until next time.