tag:blogger.com,1999:blog-59968433624240132372024-03-13T11:35:27.902+02:00Janne Pyykkö's BI BlogBI, DW, OLAP, charting, reporting, data visualization, dashboards, web user interfaces, searching for best practicesJanne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.comBlogger24125tag:blogger.com,1999:blog-5996843362424013237.post-26469258387993093232008-04-06T14:20:00.015+02:002008-04-17T07:00:10.831+02:00Google's heat map and motion chart in comparisonIn addition to the <a href="http://documents.google.com/support/spreadsheets/bin/answer.py?answer=91610">motion chart</a> (reviewed <a href="http://jpbi.blogspot.com/2008/03/motion-chart-provided-by-google.html">here</a>), Google announced a <a href="http://documents.google.com/support/spreadsheets/bin/answer.py?answer=91599">heat map gadget</a>. It's quite easy to use just like the motion chart. That is, the most difficult part is to collect data for the chart in the right format.<br /><br />The following heat maps display two variables, the <em>median age (years)</em> and the <em>unemployment rate (%)</em>, of countries by using colors on the world map (data source: <a href="https://www.cia.gov/library/publications/the-world-factbook/">CIA's World Factbook</a>).<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/R_jIjMJsdNI/AAAAAAAAASM/o91PkBMg2YI/s1600-h/medianage.PNG"><img id="BLOGGER_PHOTO_ID_5186115477799990482" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/R_jIjMJsdNI/AAAAAAAAASM/o91PkBMg2YI/s320/medianage.PNG" border="0" /></a><br /><br /><div><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/R_jIWcJsdMI/AAAAAAAAASE/gn_gYZeVdoc/s1600-h/unemployment.PNG"><img id="BLOGGER_PHOTO_ID_5186115258756658370" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/R_jIWcJsdMI/AAAAAAAAASE/gn_gYZeVdoc/s320/unemployment.PNG" border="0" /></a><br /><br />How useful is the heat map concept? Is it a good infographic?<br /></div><div>It depends: If your target is to visualize every single variable clearly, the concept works. If your target is to help your audience understand the relationship between variables, then an XY chart works better (such as Google's motion chart--here without the animation):<br /></div><div><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/R_jIqMJsdOI/AAAAAAAAASU/ndPXCM0k0JU/s1600-h/motionversion.PNG"><img id="BLOGGER_PHOTO_ID_5186115598059074786" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/R_jIqMJsdOI/AAAAAAAAASU/ndPXCM0k0JU/s400/motionversion.PNG" border="0" /></a><br /></div><div>Let's analyze: By average, the lower the median age is in a country, the higher the unemployment rate.</div><br /><div><span style="color:#cc0000;">UPDATE (Thursday, April 17, 2008):</span><br /></div><div><span style="color:#cc0000;">Included below a more traditional chart to display the relationship between median age and unemployment rate. In this chart, countries are grouped in <em>median age groups</em>. An average unemployment rate is calculated for each group.</span></div><div><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/SAbV0ug6ZTI/AAAAAAAAAS0/vlcHXk7AQqI/s1600-h/unemployment+median+age.PNG"><img id="BLOGGER_PHOTO_ID_5190070722407064882" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/SAbV0ug6ZTI/AAAAAAAAAS0/vlcHXk7AQqI/s320/unemployment+median+age.PNG" border="0" /></a></div>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com2tag:blogger.com,1999:blog-5996843362424013237.post-14624491898692906432008-03-24T16:49:00.009+02:002008-03-24T20:13:11.039+02:00Motion Chart provided by Google--an outstanding BI tool for freeGoogle just <a href="http://googledocs.blogspot.com/2008/03/collaboration-goes-one-level-deeper.html">announced</a> a <a href="http://code.google.com/apis/visualization/documentation/gadgetgallery.html">gadget gallery</a> containing one gem for the business intelligence consultants. It's a <a href="http://documents.google.com/support/spreadsheets/bin/answer.py?answer=91610">motion chart</a>, that is, an XY chart with a time dimension (feel free to play with it).<br /><br /><iframe style="BORDER-RIGHT: #cccccc 1px solid; BORDER-TOP: #cccccc 1px solid; BORDER-LEFT: #cccccc 1px solid; BORDER-BOTTOM: #cccccc 1px solid" src="http://k2alr2pc-a.gmodules.com/ig/ifr?up__table_query_url=http%3A%2F%2Fspreadsheets.google.com%2Ftq%3Frange%3DA1%253AF301%26key%3Dp-x-J5XQJ2IV6qFvcDJO_EQ%26gid%3D0%26pub%3D1&up_title=Motion%20chart%20test%2C%2010%20countries&up_state=&up__table_query_refresh_interval=0&url=http%3A%2F%2Fwww.google.com%2Fig%2Fmodules%2Fmotionchart.xml" frameborder="0" width="460" scrolling="no" height="360"></iframe><br /><br />Explanation: The XY chart above displays ten countries according to how much they collect taxes and amount of government spending (bubble size = population, bubble color = unemployment percentage). When you play years between 1970 and 1999, you see countries changing locations. By analyzing the motion it's clear that since 1970 big European countries (UK, Germany, France, Italy, Spain) have adopted policies that make them more alike (bubbles moving near each other from starting positions) while Sweden all the time stands out with high taxes and government spending. Japan seems to be the other extreme with low taxes and government spending. Between 1993 and 1997 Spain turns red with a high unemployment rate (over 20%).<br /><br />I'm really astonished for several reasons:<br /><br /><ul><li>Motion chart as such is a very useful concept for tracking changes in time (it makes you <em>understand</em> the data, not just display it).</li><li>I was able to build the chart above and publish it here in two hours, that is, very quickly as a first-time-user (+one hour more for searching useful data).</li><li>Google motion chart works really well (try changing the chart above by using pop-up menus provided--it's a dynamic chart!).</li><li>No else BI provider offers this kind of visualization tool.</li><li>And it's available for free!</li></ul>Of course, the motion chart concept is someting we have been waiting for ever since Google bought <a href="http://www.gapminder.org/">GapMinder</a>. Now Google gives it for use for everybody. For example, you can load the motion chart with the products of your company and see their profiles changing over time.<br /><br />Here are the steps (I made) to publish the motion chart...<br /><br />1. I searched data from web with keywords "time series data". The macroeconomic data I found <a href="http://www.fgn.unisg.ch/eumacro/macrodata/macroeconomic-time-series.html">here</a>. I decided to limit my test in ten countries. I collected the text data into spreadsheet. One hour.<br /><br />2. The data was not in the right format for the motion chart (see example <a href="http://spreadsheets.google.com/pub?key=pCQbetd-CptE1ZQeQk8LoNw">here</a>). I used several methods to clear and pivot the data (not only in spreadsheet) until I was ready. One hour.<br /><br />3. I copy-pasted the data into a Google spreadsheet document (yes, you must start using <a href="http://docs.google.com/">Google Docs</a> to create and publish your motion chart). I inserted the motion chart gadget and it worked immediately. I still changed the format several times, because I needed (wanted) to learn how motion chart uses spreadsheet data. One hour.<br /><br />NOTE: If you want to use quarters, months, or weeks in the time dimension, convert them into dates--such as 2008-Q3 saved as "7/1/2008" or 3rd week of 2008 saved as "1/17/2008" (Thursday, the middle day of week). Dates must be in the US format (MM/DD/YYYY) and decimal numbers with a dot as the decimal separator.<br /><br />4. From the top-right gadget menu, I chose Publish Gadget, and it gave me html to put into a web page. One minute.<br /><br />Final thoughts... I'm not completely satisfied with the fact that in order to use the motion chart Google will host my (perhaps confidential) data. Nevertheless I'm going to explore the motion chart concept with my customers in the future, I'm sure.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com12tag:blogger.com,1999:blog-5996843362424013237.post-85140338088186501962008-02-24T11:58:00.018+02:002008-02-24T15:27:46.864+02:00Google Chart API--why it's not important in BI?Just for curiosity I took a look at <a href="http://code.google.com/apis/chart/">Google Chart API</a>. Here are two sample charts I created with it.<br /><br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/R8FX9KapjtI/AAAAAAAAAQk/uDa_nuSrG9U/s1600-h/XY+sample.PNG"></a><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/R8FZeaapjvI/AAAAAAAAAQ0/J7gZhhbOYL0/s1600-h/XY+sample.PNG"><img id="BLOGGER_PHOTO_ID_5170512226220084978" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/R8FZeaapjvI/AAAAAAAAAQ0/J7gZhhbOYL0/s400/XY+sample.PNG" border="0" /></a><br /><br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/R8FYHKapjuI/AAAAAAAAAQs/N2GB52GBAOM/s1600-h/Line+sample.PNG"><img id="BLOGGER_PHOTO_ID_5170510727276498658" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/R8FYHKapjuI/AAAAAAAAAQs/N2GB52GBAOM/s400/Line+sample.PNG" border="0" /></a><br /><br />Nice charts. As the data and axes are controlled separately, it's possible to compose:<br /><br /><ul><li>A chart where axis labels are qualitative--not quantitative--which is sometimes a preferred situation in an XY chart (good vs. poor etc)</li><li>A line chart where the line changes it's value several times within a month--not easily done in Excel or other BI tools having a direct correspondence between data values and categories displayed in the X axis</li></ul>As a BI consultant, should I be interested in Google Charts?<br /><br />- No. Google Chart API is a programmer tool to build larger systems.<br /><br />OK. When such a larger system appears, should I be interested then?<br /><br />- No again. All Google Charts are by nature static pictures (PNG files), without any means to add user interaction.<br /><br />For example, if you move the mouse cursor onto a symbol (or click the symbol) in the XY chart above (or onto the line chart), nothing happens. Simply said there's no mechanism to display what the symbols represent (or what is the exact value displayed in the line chart). It's not possible to program drilling-down either. And all these features are essential in business intelligence...<br /><br />Anyway, I had fun going through the Google Chart API. If you have a mind of a programmer, it's worth studying. Here are the urls which I used to create the XY chart and the line chart (broken into several lines for clarity).<br /><ul><li>XY:<br /><span style="font-size:85%;">http: //chart.apis.google.com/chart?cht=s<br />&chd=s:984sttvuvkQIBLKNCAIi,DEJPgq0uov17zwopQODS,AFLPTXaflptx159gsDrn<br />&chxt=x,y<br />&chtt=Google+Chart+API+-+Sample+XY<br />&chxl=0:Good|Poor1:Small|Large<br />&chs=250x250<br />&chm=o,ff000033,1,1,25<br />&chg=50,50,2,5</span></li><li>Line:<br /><span style="font-size:85%;">http: //chart.apis.google.com/chart?cht=lc<br />&chd=s:rpnpnosztwurrmmllgbZQOOORVURRSQROQSRSTKCBIKIKMLIIJIKLJKKNLNVWakiljhfedcbdcZbbZ, UVVUVVUUUVVUSSVVVXXYadfhjlmpokllmmliigdbbZZXVVUUUTU<br />&chco=5555ff,ff9933<br />&chtt=Google+Chart+API+-+Sample+Line<br />&chls=2.0,1.0,0.0<br />&chxt=x,x,y<br />&chxl=0:||Oct||Nov||Dec||Jan||Feb||Mar||Apr||May||1:||2007||||||2008||||||||||2:60|70|80|90|100<br />&chs=320x175<br />&chg=101,25,2,5<br />&chf=c,ls,0,f4f4ff,0.125,ffffff,0.125<br />&chdl=First|Second</span></li></ul>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com1tag:blogger.com,1999:blog-5996843362424013237.post-53736193231124485912007-11-25T16:54:00.000+02:002007-11-25T17:13:35.655+02:00Adding value to sales data: willingness to use service againAmong other things this blog is about saving advantageous solutions for future use. After writing the previous posting <a href="http://jpbi.blogspot.com/2007/11/viewing-sales-data-by-using-xy-charts.html">Viewing sales data by using XY charts</a>, it came into my mind to compose a more general solution how to add value to your sales data.<br /><br />Let's start with the following sales table (same as before, products A-Z, etc):<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE salesdata (<br /></span><span style="font-family:courier new;font-size:85%;">product VARCHAR(1),<br /></span><span style="font-family:courier new;font-size:85%;">customerno INTEGER,<br /></span><span style="font-family:courier new;font-size:85%;">market_area INTEGER,<br /></span><span style="font-family:courier new;font-size:85%;">day DATE,<br /></span><span style="font-family:courier new;font-size:85%;">quantity INTEGER);</span><br /><br />By using the <em>salesdata</em> table, we create a new table <em>salesdata_prod_cust_pair </em>that summarizes sales for each product-customer pair within a season.<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE salesdata_prod_cust_pair (</span><br /><span style="font-family:courier new;font-size:85%;">product VARCHAR(1),</span><br /><span style="font-family:courier new;font-size:85%;">customerno INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">market_area INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">first_day_overall DATE,</span><br /><span style="font-family:courier new;font-size:85%;">first_day_season DATE,</span><br /><span style="font-family:courier new;font-size:85%;">next_day DATE,</span><br /><span style="font-family:courier new;font-size:85%;">count_season INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">quantity_season INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">new_customer_flag INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">use_again_flag INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">use_again_days_between INTEGER);</span><br /><br /><span style="font-size:85%;">The following SQL populates the table:</span><br /><br /><span style="font-family:courier new;font-size:85%;">/* Explore season 2007-Q1 */<br />INSERT INTO salesdata_prod_cust_pair</span><br /><span style="font-family:courier new;font-size:85%;">SELECT s.product, s.customerno, s.market_area,</span><br /><span style="font-family:courier new;font-size:85%;">MIN(overall.min_day) overall_first_day,</span><br /><span style="font-family:courier new;font-size:85%;">MIN(s.day) seasonal_first_day,</span><br /><span style="font-family:courier new;font-size:85%;">MIN(seasonal_next.next_day) next_day,</span><br /><span style="font-family:courier new;font-size:85%;">COUNT(*) count_season,</span><br /><span style="font-family:courier new;font-size:85%;">SUM(s.quantity) quantity_season,</span><br /><span style="font-family:courier new;font-size:85%;">1-SIGN(<span style="color:#ff0000;">DATEDIFF</span>(MIN(s.day),MIN(overall.min_day))) new_customer_flag,</span><br /><span style="font-family:courier new;font-size:85%;">SIGN(<span style="color:#ff0000;">DATEDIFF</span>(MIN(seasonal_next.next_day),MIN(s.day))) use_again_flag,</span><br /><span style="font-family:courier new;"><span style="font-size:85%;"><span style="color:#ff0000;">DATEDIFF</span>(MIN(seasonal_next.next_day),MIN(s.day)) use_again_days_between</span></span><br /><span style="font-family:courier new;font-size:85%;">FROM salesdata s</span><br /><br /><span style="font-family:courier new;font-size:85%;">LEFT OUTER JOIN(</span><br /><span style="font-family:courier new;font-size:85%;">SELECT nxt.product, nxt.customerno, MIN(nxt.day) next_day</span><br /><span style="font-family:courier new;font-size:85%;">FROM salesdata nxt,</span><br /><span style="font-family:courier new;font-size:85%;">(SELECT product, customerno, MIN(day) min_day</span><br /><span style="font-family:courier new;font-size:85%;">FROM salesdata</span><br /><span style="font-family:courier new;"><span style="font-size:85%;">WHERE<span style="color:#000000;"> day >=</span><span style="color:#ff0000;"> '2007-01-01'</span></span></span><br /><span style="font-family:courier new;font-size:85%;">GROUP BY product, customerno) q</span><br /><span style="font-family:courier new;font-size:85%;">WHERE nxt.customerno = q.customerno</span><br /><span style="font-family:courier new;font-size:85%;">AND nxt.product = q.product</span><br /><span style="font-family:courier new;font-size:85%;">AND nxt.day > q.min_day</span><br /><span style="font-family:courier new;font-size:85%;">GROUP BY nxt.product, nxt.customerno) seasonal_next</span><br /><span style="font-family:courier new;font-size:85%;">ON s.product = seasonal_next.product</span><br /><span style="font-family:courier new;font-size:85%;">AND s.customerno = seasonal_next.customerno,</span><br /><br /><span style="font-family:courier new;font-size:85%;">(SELECT product, customerno, MIN(day) min_day</span><br /><span style="font-family:courier new;font-size:85%;">FROM salesdata</span><br /><span style="font-family:courier new;font-size:85%;">GROUP BY product, customerno) overall</span><br /><br /><span style="font-family:courier new;font-size:85%;">WHERE s.product = overall.product</span><br /><span style="font-family:courier new;font-size:85%;">AND s.customerno = overall.customerno</span><br /><span style="font-family:courier new;"><span style="font-size:85%;">AND<span style="color:#000000;"> s.day BETWEEN</span><span style="color:#ff0000;"> '2007-01-01'<span style="color:#000000;"> AND</span> '2007-03-31'</span></span></span><br /><span style="font-family:courier new;font-size:85%;">GROUP BY s.product, s.customerno, s.market_area;</span><br /><br />Please note that you must change <span style="color:#ff0000;">red texts</span> according to your database. <em><span style="font-size:85%;">DATEDIFF</span>()</em> returns days between two dates. <em>2007-01-01 </em>and <em>2007-03-31</em> specify the season to explore; in this case 2007-Q1.<br /><br />Here's some data:<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/R0lzyACIr2I/AAAAAAAAAMU/KxVBstROq3k/s1600-h/salesdata_pair.PNG"><img id="BLOGGER_PHOTO_ID_5136764152832372578" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/R0lzyACIr2I/AAAAAAAAAMU/KxVBstROq3k/s400/salesdata_pair.PNG" border="0" /></a><br /><br />As you can see:<br /><ul><li>Product A was sold to Customer 4 first time in May 2006, so C-4 is not a new customer during season 2007-Q1, therefore new_customer_flag = 0.</li><li>Product A was sold to Customer 13 first time on January 5th, 2007, so C-13 is a new customer during season 2007-Q1, thus new_customer_flag = 1. This customer used the product/service again on March 8th, so use_again_flag = 1 and use_again_days_between = 62.</li></ul><p>OK. This was only the first part. By using the new <em>salesdata_prod_cust_pair </em>table, you can write in SQL:</p><p><span style="font-family:courier new;font-size:85%;">SELECT product,<br />SUM(quantity_season) "Total quantity of items sold",<br />100*SUM(new_customer_flag)/COUNT(*) "% new customers in this season",<br />100*SUM(use_again_flag*new_customer_flag)/SUM(new_customer_flag) "% willingness to use product/service again",<br />AVG(use_again_days_between/new_customer_flag) "Average days before using<br />product/service again"<br />FROM salesdata_prod_cust_pair<br />GROUP BY product;</span></p><p>.. and get the following results:</p><p><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/R0l-1QCIr3I/AAAAAAAAAMc/Ppaa2fPiZls/s1600-h/salesdata_last_table.PNG"><img id="BLOGGER_PHOTO_ID_5136776303294852978" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/R0l-1QCIr3I/AAAAAAAAAMc/Ppaa2fPiZls/s400/salesdata_last_table.PNG" border="0" /></a></p><p>See the last two columns! For example the <em>willingness to use product/service again</em> is an excellent indicator to monitor how much your customers trust your products.</p><p>By using an XY chart you can visualize the data:</p><p><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/R0mCoACIr4I/AAAAAAAAAMk/b0INWw9YQcg/s1600-h/xy+willingness.PNG"><img id="BLOGGER_PHOTO_ID_5136780473708097410" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/R0mCoACIr4I/AAAAAAAAAMk/b0INWw9YQcg/s320/xy+willingness.PNG" border="0" /></a></p><p>Let's analyze:</p><ul><li>There must be some problem in products U and B. Marketing might be ok (42% new customers), but once used the first time, only 10 % of customers are willing to use the service again.</li><li>The product C is in the other extreme. When customers use it, they probably love to use it again. Perhaps the teams responsible for products U and B can learn something from the team C?</li></ul><p>I think I'm done with the sales data for now. Next time, something else.</p>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com1tag:blogger.com,1999:blog-5996843362424013237.post-70277299797755210092007-11-04T00:20:00.000+02:002007-11-04T21:27:46.203+02:00Viewing sales data by using XY chartsAn XY chart (scatter plot) can express more than a typical bar or line chart. Here's an example how you can utilize typical sales data and create two interesting XY charts that help your decision making processes.<br /><br />Suppose you have a typical sales table:<br /><span style="font-family:courier new;font-size:85%;"></span><br /><span style="font-family:courier new;font-size:85%;">CREATE <span style="font-family:courier new;">TABLE salesdata (</span></span><br /><span style="font-family:courier new;font-size:85%;">product VARCHAR(1),</span><br /><span style="font-family:courier new;font-size:85%;">customerno INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">market_area INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">day DATE,</span><br /><span style="font-family:courier new;font-size:85%;">quantity INTEGER);</span><br /><br />.. that contains row-level sales data like this (product names A-Z, market areas 1-12):<br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Ry0H1z_Z9HI/AAAAAAAAAME/JD_-qSxHKBU/s1600-h/salesdata.PNG"><img id="BLOGGER_PHOTO_ID_5128764171715474546" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/Ry0H1z_Z9HI/AAAAAAAAAME/JD_-qSxHKBU/s320/salesdata.PNG" border="0" /></a><br /><br />By using the data, you usually create charts like this:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Ryz1Sj_Z9CI/AAAAAAAAALc/uRyKfFlMKps/s1600-h/monthly+graph.PNG"><img id="BLOGGER_PHOTO_ID_5128743774915785762" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Ryz1Sj_Z9CI/AAAAAAAAALc/uRyKfFlMKps/s320/monthly+graph.PNG" border="0" /></a><br /><br />.. and for sure this chart is informative in many ways.<br /><br />Now suppose you are interested in knowing more about products and market areas. For example: what is the average quantity of items sold in this season (Y), and how many of the customers bought the product for the first time (X).<br /><br />You can get the data by using the two following SQL queries (this season = 2007):<br /><br /><span style="font-family:courier new;font-size:85%;">/* XY for products */<br />SELECT s.product,<br />AVG(s.quantity) "Average quantity of items sold",<br />100*SUM(1-SIGN(ABS(s.day-q.min_day)))/COUNT(*) "% new customers in this season",<br />SUM(s.quantity) "Total quantity of items sold"</span><br /><span style="font-family:courier new;font-size:85%;">FROM salesdata s,<br />(SELECT product, customerno, MIN(day) min_day<br />FROM salesdata<br />GROUP BY product, customerno) q<br />WHERE s.product = q.product<br />AND s.customerno = q.customerno<br />AND s.day >= '2007-01-01'<br />GROUP BY s.product;</span><br /><br /><span style="font-family:courier new;font-size:85%;">/* XY for market areas */<br />SELECT s.market_area,<br />AVG(s.quantity) "Average quantity of items sold",<br />100*SUM(1-SIGN(ABS(s.day-q.min_day)))/COUNT(*) "% new customers in this season",</span><br /><span style="font-family:courier new;font-size:85%;">SUM(s.quantity) "Total quantity sold in area"<br />FROM salesdata s,<br />(SELECT product, customerno, MIN(day) min_day<br />FROM salesdata<br />GROUP BY product, customerno) q<br />WHERE s.product = q.product<br />AND s.customerno = q.customerno<br />AND s.day >= '2007-01-01'<br />GROUP BY s.market_area;</span><br /><br />Now the charts:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Ryz5tj_Z9FI/AAAAAAAAAL0/qgwvEpwAPpE/s1600-h/xy+products.PNG"><img id="BLOGGER_PHOTO_ID_5128748636818764882" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Ryz5tj_Z9FI/AAAAAAAAAL0/qgwvEpwAPpE/s320/xy+products.PNG" border="0" /></a><br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Ryz5lz_Z9EI/AAAAAAAAALs/K8X76nEDTP8/s1600-h/xy+areas.PNG"><img id="BLOGGER_PHOTO_ID_5128748503674778690" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Ryz5lz_Z9EI/AAAAAAAAALs/K8X76nEDTP8/s320/xy+areas.PNG" border="0" /></a><br /><br />Let's analyze:<br /><ul><li>Products clearly have different profiles. Product F is favored by old customers, while products L and T have recently found many new customers. In a single sales transaction, product U is sold by average nearly twice as much as D.</li><li>Symbol size is the total quantity sold. There are no big differences in the sales of different products. More difference exist in various market areas: area 1 is the most important, while areas 9, 10, and 11 are the smallest. (I created the data by using a random function.)</li><li>All market areas have almost the same "average quantity of items sold". The market areas differ more in X axis: in area 1 only 18% are new customers, while in areas 10 and 12 almost half of the customers are new. Perhaps the company just started marketing in areas 10 and 12?</li></ul><p>All this is interesting -- no less so because the basic line chart cannot express this knowledge.</p><p>Nevertheless it must be said that these charts don't yet tell anything about the prospects in different market areas. Adding facts about market area size (for each product) would help your decision making where to target your marketing in the next season.</p><p>Therefore, don't be satisfied with line and bar charts, if you can improve your decision making by using XY...</p>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com3tag:blogger.com,1999:blog-5996843362424013237.post-12765164322578650342007-10-23T21:29:00.000+02:002007-10-24T08:15:18.234+02:00Age pyramid modelsIn the recent days I've read with great interests <a href="http://charts.jorgecamoes.com/category/blog/">Jorge Camoes' blog</a> about his <a href="http://charts.jorgecamoes.com/how-to-create-an-excel-dashboard/">Excel Demographic Dashboard</a> and attempts to enhance it with Crystal Xcelsius. The Excel dashboard even contains two animated reports that help you see how the age structure in various countries and continents will change between years 1995 and 2050.<br /><br />First, thanks for the information! The dashboard opened my eyes to see that the trend of growing average age is valid everywhere, including African countries (not only in Europe).<br /><br />Second, thanks for a good sample! This well-documented test helps us compare BI softwares against each other.<br /><br />Jorge's <a href="http://charts.jorgecamoes.com/xcelsius-dashboard-the-population-pyramid/">recent post</a> discusses different approaches to create population pyramids with Excel (XY chart) and Crystal Xcelsius (stacked bar). Excel approach appears to me more flexible, because it allows you to draw several lines (several countries) into same age pyramid. Xcelsius falls short because Xcelsius-XY chart doesn't draw lines (only data points).<br /><br />Well, let me add one more approach to the age pyramid problem.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rx5hRuGcpmI/AAAAAAAAALU/FTxbx_FGHm4/s1600-h/agepyramid4models.PNG"><img id="BLOGGER_PHOTO_ID_5124640383054947938" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rx5hRuGcpmI/AAAAAAAAALU/FTxbx_FGHm4/s400/agepyramid4models.PNG" border="0" /></a><br /><br />I created the four age pyramid models by using (not XY or stacked bar but) the <em>Sheet tool</em> in Voyant. The software allows you to format a cell (to be precise, a column of a sheet) to display a bar or a line (instead of the number it contains).<br /><br />Benefits of this approach? It's the flexibility to mix numbers and graphics together. For example, if you need more space for bars or lines, numbers will immediately move to the right, because they are part of the same sheet.<br /><br />OK.<br /><br />From now on, a link to Jorge Camoes' BI Blog is provided under my link list.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com1tag:blogger.com,1999:blog-5996843362424013237.post-4322419656686752682007-09-22T20:10:00.000+02:002007-09-22T20:34:28.480+02:00Scatter plot found in Helsingin SanomatAs I have criticized Helsingin Sanomat, the largest newspaper in Finland, for not publishing any scatter plot charts (XY charts), here's one!<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RvVau-GcphI/AAAAAAAAAK0/21clzAe3dhA/s1600-h/hsxy20070921.JPG"><img id="BLOGGER_PHOTO_ID_5113092714939590162" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RvVau-GcphI/AAAAAAAAAK0/21clzAe3dhA/s400/hsxy20070921.JPG" border="0" /></a><br /><br />It was published on Friday 21st September 2007 (the first XY during my blog).<br /><br />The data: Number of economic reforms made (x) and economic growth rate (y) since 1989 when communism collapsed in Eastern Europe. Blue circle = European Union member state.<br /><br />It's easy to see that it pays off for a state to join EU.<br /><br />That's a powerful message displayed clearly. No other graph would have done that this well.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-32193806121055053332007-08-12T18:34:00.000+03:002007-08-12T22:51:15.903+03:00Observations about Edward Tufte's classic The Visual Display of Quantitative Information<a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr88tkWSAkI/AAAAAAAAAKc/khzVpXOgzBQ/s1600-h/bookcover.PNG"><img id="BLOGGER_PHOTO_ID_5097860056755601986" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr88tkWSAkI/AAAAAAAAAKc/khzVpXOgzBQ/s200/bookcover.PNG" border="0" /></a>More than two months ago I read a book The Visual Display of Quantitative Information by <a href="http://en.wikipedia.org/wiki/Edward_Tufte">Edward Tufte</a>.<br /><br />Many already know this classic (1st edition 1981, 2nd 2001) that introduces the first theoretical view of statistical graphics. Instead of writing a full review, I will tell what I liked most.<br /><br />In the end of this post, you'll also find a comparison of this book and Stephen Few's Information Dashboard Design.<br /><br /><br /><strong>1. History</strong><br /><br />I enjoyed reading where all started. We don't know who invented <a href="http://en.wikipedia.org/wiki/Traditional_Chinese">traditional Chinese characters</a>. We know only according to legends that the inventor of <a href="http://en.wikipedia.org/wiki/Greek_alphabet">Greec alphabet</a> was <a href="http://en.wikipedia.org/wiki/Cadmus">Cadmus</a>. But for sure we know who invented line and bar charts. He was <a href="http://en.wikipedia.org/wiki/William_Playfair">William Playfair</a> (1759-1823). Here's one of his piece of art (its copyright has expired):<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rr8ZB0WSAiI/AAAAAAAAAKM/URXkOEMv4gQ/s1600-h/playfair.PNG"><img id="BLOGGER_PHOTO_ID_5097820822229352994" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rr8ZB0WSAiI/AAAAAAAAAKM/URXkOEMv4gQ/s320/playfair.PNG" border="0" /></a><br /><br />For Playfair, graphics were preferable to tables because graphics showed the shape of the data in a comparative perspective. The line chart easily contained data of hundred years displaying for example imports and exports or the national debt of Britain. Colonial wars caused rises and falls thus proving that the government policy has its economic consequences.<br /><br /><strong>2. More history</strong><br /><br />The book contains many other examples of graphical art too. As in early days every graphic was cut into metal or stone surface as a mirror image, surely each piece of graphic is more valuable and special compared to our moders graphics. Here's a masterpiece from <a href="http://en.wikipedia.org/wiki/Charles_Minard">Charles Joseph Minard</a> (its copyright has expired):<br /><br /><a href="http://upload.wikimedia.org/wikipedia/commons/thumb/2/29/Minard.png/800px-Minard.png"><img style="WIDTH: 320px; CURSOR: hand" alt="" src="http://upload.wikimedia.org/wikipedia/commons/thumb/2/29/Minard.png/800px-Minard.png" border="0" /></a><br /><br />The well-known graphic shows the terrible fate of Napoleon's army in Russia during 1812-1813. Starting from Poland on the left the size of the army is 422,000. While going towards east the width of the gray band indicates the diminishing size of the army at each place on the map. In Moscow there are 100,000 men left. Going back to Poland the dark line becomes thinner and thinner thus displaying devastating losses. In Tufte's words: <em>It may well be the best statistical graphic ever drawn</em>.<br /><br /><strong>3. Scatterplots</strong><br /><br />Tufte uses 23 pages of 197 (11.7%) to talk about XY charts. I'm impressed about the difference he makes between XY and other charts, because I feel the same. Tufte's words: <em>[...] the scatterplot and its variants ... [are] the greatest of all graphical designs. It links at least two variables, encouraging and even imploring the viewer to assess the possible causal relationship between the plotted variables</em>.<br /><br />In fact, when I read Helsingin Sanomat, the largest newspaper in Finland, I from time to time see it losing the possibility to use XY. The following image was published by the newspaper about the population growth rate in twenty cities of Finland (translated in English).<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rr8X3UWSAfI/AAAAAAAAAJ0/lFXuugy7DIo/s1600-h/bar+hesari.PNG"><img id="BLOGGER_PHOTO_ID_5097819542329098738" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/Rr8X3UWSAfI/AAAAAAAAAJ0/lFXuugy7DIo/s320/bar+hesari.PNG" border="0" /></a><br /><br />Here is my view of the same data by using a scatterplot:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr8YBkWSAgI/AAAAAAAAAJ8/X8I7OuBxbTg/s1600-h/xy+hesari.PNG"><img id="BLOGGER_PHOTO_ID_5097819718422757890" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr8YBkWSAgI/AAAAAAAAAJ8/X8I7OuBxbTg/s320/xy+hesari.PNG" border="0" /></a><br /><br />Some cities grow fast (top-right corner) both because of natural population growth and positive net migration, while for example the lowest city Kajaani has a problem with negative net migration (-5%). A national decision-maker would appreciate this XY chart, as it clearly shows which cities are alike and which aren't.<br /><br /><strong>4. Small multiples</strong><br /><br />Small multiples is something I haven't paid attention, when I was a programmer. The idea is great. See how much information is contained in the following set of charts.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rr8eG0WSAjI/AAAAAAAAAKU/yVuz2X63p7A/s1600-h/multiples.jpg"><img id="BLOGGER_PHOTO_ID_5097826405686837810" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rr8eG0WSAjI/AAAAAAAAAKU/yVuz2X63p7A/s400/multiples.jpg" border="0" /></a><br /><br />Unfortunately, to create small multiples with a modern charting application is far from straightforward. For example, when composing the sample image above, I had to size and position six charts precisely to form a satisfactory result. Another bad point: Individual charts tend to have a different automatic scale in Y axis, which is a problem, when you quickly try to get grip of the data.<br /><br /><strong>5. New concepts introduced<br /></strong><br />To me it's fun to learn what concepts others have invented. In this book Tufte introduces a couple of them, of which the <em>data-ink ratio</em> is most well-known: above all else show the data, maximize the data-ink ratio, erase non-data-ink and redundant data-ink, revise and edit.<br /><br />Other concepts such as lie factor and data density I also find interesting. <em>Lie factor</em>: show data variation, not design variation. <em>Data density</em>: the small multiples chart above is an example of a high-information graphics = you don’t have to repeat titles or legends for every individual chart.<br /><br /><strong>Data visualization books in comparison</strong><br /><br />As I have previously reviewed Stephen Few's Information Dashboard Design [<a href="http://jpbi.blogspot.com/2007/04/great-book-about-dashboard-design-by.html">link</a>] that also discusses about data visualization, here's my observations how these great books differ:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr9hjkWSAmI/AAAAAAAAAKs/m8A0aFk3ok4/s1600-h/bookcomparison.PNG"><img id="BLOGGER_PHOTO_ID_5097900566887137890" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Rr9hjkWSAmI/AAAAAAAAAKs/m8A0aFk3ok4/s400/bookcomparison.PNG" border="0" /></a><br /><br />By the way, there are newer books from Tufte available such as Beautiful Evidence, but I haven’t read it yet.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com3tag:blogger.com,1999:blog-5996843362424013237.post-23707376463906599642007-07-30T18:48:00.001+03:002007-07-30T20:05:55.083+03:00How to do enhanced project chartsMy <a href="http://jpbi.blogspot.com/2007/07/viewing-project-data-new-options.html">idea about enhanced project charting</a> raised some <a href="http://www.websitetoolbox.com/tool/post/sfew/vpost?id=2047951">discussion</a> in the web forum provided by <a href="http://www.perceptualedge.com/">Perceptual Edge</a>, Stephen Few's site.<br /><br />Just to convince you it wasn't Excel (!), here's the dialog box to specify the color scale in Voyant:<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rq4IkkWSAbI/AAAAAAAAAJU/4Ezm-MFWUb0/s1600-h/proj2_a.PNG"><img id="BLOGGER_PHOTO_ID_5093017652928053682" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/Rq4IkkWSAbI/AAAAAAAAAJU/4Ezm-MFWUb0/s320/proj2_a.PNG" border="0" /></a><br /><br />Actually the process of creating such project charts in Voyant needs only three steps:<br /><br />1. Create a projdata table and populate it with sample data (or use a corresponding table in your database).<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE projdata (<br /></span><span style="font-family:courier new;font-size:85%;">project_name VARCHAR(20),<br /></span><span style="font-family:courier new;font-size:85%;">day DATE,<br /></span><span style="font-family:courier new;font-size:85%;">hours DECIMAL(20,2));</span><br /><br />After that it takes only 5 minutes:<br />2. By using a Sheet chart, specify four dimensions:<br /><br /><span style="font-family:courier new;font-size:85%;">Measure: Hours = SUM(hours)</span><br /><span style="font-family:courier new;font-size:85%;">Rows: Project = project_name</span><br /><span style="font-family:courier new;font-size:85%;">Columns, outer dimension: Year = {fn YEAR({fn TIMESTAMPADD(SQL_TSI_DAY,-{fn MOD({fn DAYOFWEEK(day)}+5,7)}+3,day)})}</span><br /><span style="font-family:courier new;font-size:85%;">Columns, inner dimension: Week = {fn FLOOR(1+({fn DAYOFYEAR({fn TIMESTAMPADD(SQL_TSI_DAY,-{fn MOD({fn DAYOFWEEK(day)}+5,7)}+3,day)})}-1)/7)}</span><br /><br />NOTE: The expressions above apply in SQL Server, MySQL, and other ODBC compatible databases. For other suitable expressions and how to limit the time frame, see my posting about <a href="http://jpbi.blogspot.com/2007/04/monitoring-changes-over-time-my-best.html">monitoring changes over time</a>.<br /><br />3. Use the Sheet Cell Format dialog box (presented in the beginning): first for the whole chart, and second for the total row.<br /><br />For the whole sheet, select <em>Min & max determined automatically</em>, so the color scale adapts to the greatest and least number of hours found from the data (excluding the total row).<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Rq4aM0WSAeI/AAAAAAAAAJs/od7yCy9Mn8A/s1600-h/proj2_c.PNG"><img id="BLOGGER_PHOTO_ID_5093037036115460578" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Rq4aM0WSAeI/AAAAAAAAAJs/od7yCy9Mn8A/s320/proj2_c.PNG" border="0" /></a><br /><br />For the total row, it's necessary to deselect <em>Exclude totals, formulas, and cumulative numbers</em>, so that the color scale adapts to the numbers presented in the total row.<br /><br />It's also possible to use a tool tip to show the exact work hours by moving a mouse cursor over a cell of interest, as in the following image:<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rq4I3kWSAdI/AAAAAAAAAJk/LPd2XpmEanc/s1600-h/proj2_b.PNG"><img id="BLOGGER_PHOTO_ID_5093017979345568210" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/Rq4I3kWSAdI/AAAAAAAAAJk/LPd2XpmEanc/s400/proj2_b.PNG" border="0" /></a><br /><br />I hope we can soon see the color scale feature available in other OLAP tools too.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com1tag:blogger.com,1999:blog-5996843362424013237.post-65443324430265447782007-07-25T19:02:00.000+03:002007-07-30T20:48:55.765+03:00Viewing project data--new options availableIn this posting, I will explore different ways to view project data. Some options have become available just recently.<br /><br />Let's start with the well-known <a href="http://en.wikipedia.org/wiki/Gantt_Chart">gantt chart</a>. It's a great tool for project scheduling.<br /><div><br /><div><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Rqd0bEWSAVI/AAAAAAAAAIk/Pcm1w67jxlE/s1600-h/projects_1.PNG"><img id="BLOGGER_PHOTO_ID_5091165912138121554" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/Rqd0bEWSAVI/AAAAAAAAAIk/Pcm1w67jxlE/s320/projects_1.PNG" border="0" /></a><br /></div><div>The gantt chart displays the start and finish date for each project. For example, the chart above shows that Project P is the longest project, starting on November 2006 finishing on August 2007.<br /></div><div>However, a gantt chart doesn't show us how much resources each project requires.<br /></div><div>Let's try a line chart with the same data, work hours displayed on a weekly basis.<br /></div><div><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Rqd0iEWSAWI/AAAAAAAAAIs/h8Yl1WmzuXM/s1600-h/projects_2.PNG"><img id="BLOGGER_PHOTO_ID_5091166032397205858" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/Rqd0iEWSAWI/AAAAAAAAAIs/h8Yl1WmzuXM/s320/projects_2.PNG" border="0" /></a></div><div></div><div>Now we can see that project P (the longest project) doesn't require too much resources. Project K requires much more and so does A and G. You can also see the timing of these projects, that is, when the most work is done. This is important information that the gantt chart didn't tell us!<br /></div><div>OK, let's try the area chart.<br /></div><div><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rqd0nUWSAXI/AAAAAAAAAI0/W5l8W4q5TTw/s1600-h/projects_3.PNG"><img id="BLOGGER_PHOTO_ID_5091166122591519090" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/Rqd0nUWSAXI/AAAAAAAAAI0/W5l8W4q5TTw/s320/projects_3.PNG" border="0" /></a></div><div>Now you can see the total resources required for all projects.<br /></div><div>However, the line and area charts are hard to read when you have many projects. The colors of the 15 projects above are so-so distinguishable. If you have hundreds of projects, the line and area charts become useless.</div><div></div><div>Fortunately we have some new options available. Microsoft Excel calls this option "color scale" and so does Voyant's new version. Here's the same project data with Voyant.</div><div><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rqd1tUWSAYI/AAAAAAAAAI8/SNYRoSjCDos/s1600-h/projects_4.PNG"><img id="BLOGGER_PHOTO_ID_5091167325182361986" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/Rqd1tUWSAYI/AAAAAAAAAI8/SNYRoSjCDos/s400/projects_4.PNG" border="0" /></a><br /></div><div>It looks like the original gantt chart!<br /></div><div>But instead of having just solid blue bars, now each blue bar contains much more information. First, there are weekly hours available. Second (and what is most important), the intensity of color tells us directly which project is hot and when.<br /></div><div>If there are hundreds of projects, this kind of display still works (you just scroll down).<br /></div><div>There's also an alternative to hide the numbers.<br /></div><div><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/Rqd13UWSAZI/AAAAAAAAAJE/y7ejZOcvCtk/s1600-h/projects_5.PNG"></a></div><div></div><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RqeHR0WSAaI/AAAAAAAAAJM/JAteHz1G4uU/s1600-h/projects_6.PNG"><img id="BLOGGER_PHOTO_ID_5091186643945259426" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RqeHR0WSAaI/AAAAAAAAAJM/JAteHz1G4uU/s400/projects_6.PNG" border="0" /></a><br /><div></div><div>In the image above, the total row displays the overall resources required (red color scale).</div></div>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com3tag:blogger.com,1999:blog-5996843362424013237.post-71171079257693965562007-06-05T23:00:00.000+03:002007-06-06T08:10:06.390+03:003D calculation exampleI have been asked how to create 3D rotation as in <a href="http://jpbi.blogspot.com/2007/04/crystal-xcelcius-playful-3d-demo.html">the posting about Crystal Xcelsius</a>.<br /><div><div></div><br /><div>When x, y, and z go like in the image below, and x = longitude 0, and y = longitude 90 degrees, then for arbitrary latitude and longitude:</div><div> </div><div></div><div></div><div>x = r * sin(longitude) * cos(latitude)</div><div>y = r * cos(longitude) * cos(latitude)</div><div>z = r * sin(latitude)</div><div></div><br /><div>.. where r = radius of ball.</div><div></div><div><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RmXEW_2X_OI/AAAAAAAAAIM/PsDjj6biD10/s1600-h/3d+ball+small.PNG"><img id="BLOGGER_PHOTO_ID_5072676454678396130" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RmXEW_2X_OI/AAAAAAAAAIM/PsDjj6biD10/s320/3d+ball+small.PNG" border="0" /></a> </div><div></div><div></div><div>When you add the eye (watching point) to the image, then for arbitrary point <strong>a</strong> in coordinates (xa, ya, za), you can calculate point <strong>a'</strong> (xa', ya') in the plane between eye and ball.<br /></div><div><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RmXEe_2X_PI/AAAAAAAAAIU/TOYMVb3aZS8/s1600-h/3d+ball+eye+small.PNG"><img id="BLOGGER_PHOTO_ID_5072676592117349618" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RmXEe_2X_PI/AAAAAAAAAIU/TOYMVb3aZS8/s320/3d+ball+eye+small.PNG" border="0" /></a></div><div></div><div>The next image helps understand how you calculate ya' in the plane:</div><div></div><div><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RmXK3P2X_QI/AAAAAAAAAIc/IjebdsR6eq0/s1600-h/final+xy+small.PNG"><img id="BLOGGER_PHOTO_ID_5072683605798944002" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RmXK3P2X_QI/AAAAAAAAAIc/IjebdsR6eq0/s320/final+xy+small.PNG" border="0" /></a> </div><div>xa' = ya/(d-xa)<br />ya' = za/(d-xa)</div><div></div><div>.. or if you want to put little less perspective, you can do:</div><div></div><div>xa' = arctan(ya/(d-xa))</div><div>ya' = arctan(za/(d-xa))</div><div></div><div>The point <strong>a'</strong> (xa', ya') is input for the XY chart.</div></div>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com2tag:blogger.com,1999:blog-5996843362424013237.post-88681026009488803742007-05-23T19:28:00.000+03:002007-05-23T18:07:20.380+03:00SQL trick for flattening a parent-child table and vice versaAmong other things this blog is about saving tricks for future use. This SQL trick I used in my previous example of <a href="http://jpbi.blogspot.com/2007/05/xy-chart-and-hierarchical-graphs.html">hierarchical organization structure</a> = How to flatten a parent-child table.<br /><br />Suppose you have the following parent-child table for your hierarchy:<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE hierarchy_parent_child (<br />node_id INTEGER NOT NULL,<br />parent_id INTEGER,<br />node_name VARCHAR(40)<br />);</span><br /><br />.. and the table contains data like this (please note NULL in root parent_id):<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RlRO0Sjf9UI/AAAAAAAAAIE/CfyCKcQJCdo/s1600-h/parent_child.PNG"><img id="BLOGGER_PHOTO_ID_5067762140939547970" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RlRO0Sjf9UI/AAAAAAAAAIE/CfyCKcQJCdo/s200/parent_child.PNG" border="0" /></a><br /><br />.. and you want to change it into flat format, here's the SQL:<br /><br /><span style="font-family:courier new;font-size:85%;">SELECT<br />lev01.node_id id_01, lev01.node_name name_01,<br />lev02.node_id id_02, lev02.node_name name_02,<br />lev03.node_id id_03, lev03.node_name name_03,<br />lev04.node_id id_04, lev04.node_name name_04,<br />lev05.node_id id_05, lev05.node_name name_05,<br />lev06.node_id id_06, lev06.node_name name_06,<br />lev07.node_id id_07, lev07.node_name name_07,<br />lev08.node_id id_08, lev08.node_name name_08,<br />lev09.node_id id_09, lev09.node_name name_09,<br />lev10.node_id id_10, lev10.node_name name_10<br />FROM hierarchy_parent_child lev01<br />LEFT OUTER JOIN hierarchy_parent_child lev02 ON lev01.node_id = lev02.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev03 ON lev02.node_id = lev03.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev04 ON lev03.node_id = lev04.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev05 ON lev04.node_id = lev05.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev06 ON lev05.node_id = lev06.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev07 ON lev06.node_id = lev07.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev08 ON lev07.node_id = lev08.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev09 ON lev08.node_id = lev09.parent_id<br />LEFT OUTER JOIN hierarchy_parent_child lev10 ON lev09.node_id = lev10.parent_id<br />WHERE lev01.parent_id IS NULL;</span><br /><br />.. then it returns rows like this:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RlROrCjf9TI/AAAAAAAAAH8/kB7sqh-4HBI/s1600-h/hierarchy_flat.PNG"><img id="BLOGGER_PHOTO_ID_5067761982025758002" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RlROrCjf9TI/AAAAAAAAAH8/kB7sqh-4HBI/s400/hierarchy_flat.PNG" border="0" /></a><br /><br />The example above works for ten levels but it's easy to add more levels by repeating the pattern.<br /><br />If you want to go vice versa, suppose you have the following flat hierarchy table:<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE hierarchy_flat (</span><br /><span style="font-family:courier new;font-size:85%;">id_01 INTEGER,</span><br /><span style="font-family:courier new;font-size:85%;">name_01 VARCHAR(40),<br />id_02 INTEGER,<br />name_02 VARCHAR(40),<br />id_03 INTEGER,<br />name_03 VARCHAR(40),<br />id_04 INTEGER,<br />name_04 VARCHAR(40),<br />id_05 INTEGER,<br />name_05 VARCHAR(40),<br />id_06 INTEGER,<br />name_06 VARCHAR(40),<br />id_07 INTEGER,<br />name_07 VARCHAR(40),<br />id_08 INTEGER,<br />name_08 VARCHAR(40),<br />id_09 INTEGER,<br />name_09 VARCHAR(40),<br />id_10 INTEGER,<br />name_19 VARCHAR(40)</span><br /><span style="font-family:courier new;font-size:85%;">);</span><br /><br />.. then you can do SQL like this:<br /><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_01 node_id, NULL parent_id, name_01 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_02 node_id, id_01 parent_id, name_02 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM </span><span style="font-family:courier new;font-size:85%;">hierarchy_flat WHERE id_02 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_03 node_id, id_02 parent_id, name_03 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_03 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_04 node_id, id_03 parent_id, name_04 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_04 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_05 node_id, id_04 parent_id, name_05 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_05 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_06 node_id, id_05 parent_id, name_06 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_06 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_07 node_id, id_06 parent_id, name_07 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_07 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_08 node_id, id_07 parent_id, name_08 node_name</span><br /><span style="font-family:courier new;font-size:85%;">FROM hierarchy_flat WHERE id_08 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_09 node_id, id_08 parent_id, name_09 node_name<br />FROM hierarchy_flat WHERE id_09 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT DISTINCT id_10 node_id, id_09 parent_id, name_10 node_name<br />FROM hierarchy_flat WHERE id_10 IS NOT NULL</span><br /><span style="font-family:courier new;font-size:85%;">ORDER BY 1</span><br /><br />.. and you get the original parent-child table.<br /><br />Again, repeat the pattern to enable more than ten levels.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com6tag:blogger.com,1999:blog-5996843362424013237.post-54074324396693757172007-05-22T19:43:00.000+03:002007-05-22T20:34:39.162+03:00XY chart and hierarchical graphsHere's a video that continues the XY chart ideas in my <a href="http://jpbi.blogspot.com/2007/05/xy-chart-and-graph-theory.html">previous postings</a>. (Sorry for really bad blurring. Anyway, as crappy as this video is, you can see the idea.)<br /><br /><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/w5WQYBsOwfo"><embed src="http://www.youtube.com/v/w5WQYBsOwfo" type="application/x-shockwave-flash" width="425" height="350"></embed></object><br /><br />This sample hierarchical system (hierarchical graph) consists of 25 thousand parts in maximum eight levels. It allows me to drill-down from any branch (only one branch is open at each level). When I return to an upper level, all lower levels and nodes under it disappear and new branching starts.<br /><br />Here's a better image:<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RlLwXyjf9RI/AAAAAAAAAHs/EqKty0KQDRA/s1600-h/xy+hierarchy.PNG"><img id="BLOGGER_PHOTO_ID_5067376822243554578" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RlLwXyjf9RI/AAAAAAAAAHs/EqKty0KQDRA/s320/xy+hierarchy.PNG" border="0" /></a><br /><br />This graph user interface is contextual. It's the only practical way to display large graphs in a limited space such as on a computer screen. It only shows the related nodes around the selected one. Following this principle, in a non-hierarchical graph you can display nodes linked to the central node, as well as the nodes linked to them (1st and 2nd level links). When you click a node, the clicked node becomes the central one, etc.<br /><br />This kind of user interface is good for exploring large organization structures (or any hierarchical graphs). Since the system relies on the idea of drilling-down (actually, <a href="http://jpbi.blogspot.com/2007/04/self-drilling-and-finite-state-machines.html">self-drilling</a>), it's well possible to display extra information (tables and charts) about the currently selected node near the hierarchy tree (under it or to the right of it).<br /><br />Technically, to calculate (x, y) coordinates, each hierarchy node has to know its level, the max number of levels, the order within its siblings (between 1...n), and the number of siblings (n).<br /><br />x = my sibling id / number of siblings - 1 / (2 * number of siblings)<br />y = 1 - (my level / max number of levels)<br /><br />The lines between nodes are drawn with the same kind of principles as in my previous postings.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-20032316665881199782007-05-16T22:45:00.000+03:002007-05-18T18:17:30.736+03:00Accidental chart artSometimes you end up creating beauty by accident. (They are line charts.)<br /><br /><div><div><div><div><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RktgIyjf9MI/AAAAAAAAAHE/qYO2kB6tqC4/s1600-h/4-10.PNG"><img id="BLOGGER_PHOTO_ID_5065247910034142402" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RktgIyjf9MI/AAAAAAAAAHE/qYO2kB6tqC4/s200/4-10.PNG" border="0" /></a> <a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Rktf-yjf9LI/AAAAAAAAAG8/n8ldkurU7N8/s1600-h/8-10.PNG"><img id="BLOGGER_PHOTO_ID_5065247738235450546" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/Rktf-yjf9LI/AAAAAAAAAG8/n8ldkurU7N8/s200/8-10.PNG" border="0" /></a></div><div><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RktgQSjf9NI/AAAAAAAAAHM/bNVRd_dOCP0/s1600-h/14-10.PNG"><img id="BLOGGER_PHOTO_ID_5065248038883161298" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RktgQSjf9NI/AAAAAAAAAHM/bNVRd_dOCP0/s200/14-10.PNG" border="0" /></a> <a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RktgZCjf9OI/AAAAAAAAAHU/XRVbjF2MZuk/s1600-h/20-10.PNG"><img id="BLOGGER_PHOTO_ID_5065248189207016674" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RktgZCjf9OI/AAAAAAAAAHU/XRVbjF2MZuk/s200/20-10.PNG" border="0" /></a></div><div><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RktghSjf9PI/AAAAAAAAAHc/szTrvF0YkYM/s1600-h/22-10.PNG"><img id="BLOGGER_PHOTO_ID_5065248330940937458" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RktghSjf9PI/AAAAAAAAAHc/szTrvF0YkYM/s200/22-10.PNG" border="0" /></a> <a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RktgpSjf9QI/AAAAAAAAAHk/D9zrqgWe_WI/s1600-h/30-10.PNG"><img id="BLOGGER_PHOTO_ID_5065248468379890946" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RktgpSjf9QI/AAAAAAAAAHk/D9zrqgWe_WI/s200/30-10.PNG" border="0" /></a></div></div></div></div>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-25199497175347791862007-05-13T16:55:00.000+03:002007-05-18T17:43:19.892+03:00XY chart and graph theoryDo you know <a href="http://en.wikipedia.org/wiki/Graph_theory">graph theory</a>? To me it's inspiring as such. In business intelligence, a graph (in the meaning of graph theory) provides a method to visualize <em>relationships </em>between entities such as between people, companies, geographical locations, etc.<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RkcxIr2weeI/AAAAAAAAAGs/-v3m0ZeUX78/s1600-h/simplegraph.PNG"><img id="BLOGGER_PHOTO_ID_5064070331282520546" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RkcxIr2weeI/AAAAAAAAAGs/-v3m0ZeUX78/s200/simplegraph.PNG" border="0" /></a><br /><br />Hierarchical systems, where each element of the system (except for the top element) is subordinate to a single other element, are graphs too. Various organization charts and genealogy charts (both in biology and as family trees) are good examples of such graphs.<br /><br />XY chart is a good way to visualize graphs. To do it clearly, at least XY must be able to:<br /><br /><br /><ul><li>Draw data points (entities)</li><li>Draw lines between data points to visualize existing relationships</li><li>Label data points (entities)</li></ul>In my <a href="http://jpbi.blogspot.com/2007/05/drawing-polygons-with-xy-chart.html">previous posting</a> I already wrote about the first two aspects. The third aspect, labeling a data point, is necessary to understand the meaning of nodes of a graph.<br /><br />The following step-by-step example shows a method how you can automatically create a graph by using XY and two tables graph_node (entities) and graph_link (relationships).<br /><br />1. Create the graph_node and graph_link tables.<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE graph_node (node_id VARCHAR(3), node_name VARCHAR(40), node_order INTEGER, node_count INTEGER);</span><br /><span style="font-family:courier new;font-size:85%;">CREATE TABLE graph_link (node1_id VARCHAR(3), node2_id VARCHAR(3));</span><br /><br />2. Populate tables with countries of Europe and border neighboring relationships (or anything you wish).<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RkcIA72weZI/AAAAAAAAAGE/hZKC6HLGfWI/s1600-h/tablecontents.PNG"><img id="BLOGGER_PHOTO_ID_5064025118161795474" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RkcIA72weZI/AAAAAAAAAGE/hZKC6HLGfWI/s320/tablecontents.PNG" border="0" /></a><br /><br />Note that I have included the node_order and node_count columns in the graph_node table. They are necessary to calculate (x, y) locations of data points later on. In a real situation, you should calculate node_order and node_count automatically by using SQL (or perhaps your OLAP system is able to calculate them on the fly).<br /><br />By the way, there are 43 countries in Europe and 80 borders between them (not including Vatican and Turkey as nodes; including RUS-LTU, RUS-POL, and GBR-ESP relationships because of separate areas of Kaliningrad and Gibraltar).<br /><br />3. Create a graph_link_view based on the two tables:<br /><br /><span style="font-family:courier new;font-size:85%;">CREATE VIEW graph_link_view (link_id, datapoint_id, node_id)</span><br /><span style="font-family:courier new;font-size:85%;">AS</span><br /><span style="font-family:courier new;font-size:85%;">SELECT CONCAT(node1_id,node2_id), 0, node1_id FROM graph_link</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT CONCAT(node1_id,node2_id), 1, node2_id FROM graph_link</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT CONCAT(node1_id,node2_id), -1, NULL FROM graph_link</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT node_id, 0, node_id FROM graph_node</span><br /><span style="font-family:courier new;font-size:85%;">WHERE node_id not in (select node1_id from graph_link)</span><br /><span style="font-family:courier new;font-size:85%;">AND node_id not in (select node2_id from graph_link)</span><br /><span style="font-family:courier new;font-size:85%;">UNION</span><br /><span style="font-family:courier new;font-size:85%;">SELECT node_id, -1, NULL FROM graph_node</span><br /><span style="font-family:courier new;font-size:85%;">WHERE node_id not in (select node1_id from graph_link)</span><br /><span style="font-family:courier new;font-size:85%;">AND node_id not in (select node2_id from graph_link);</span><br /><br />This view is necessary because of the way the XY chart draws data points and lines between them (see my <a href="http://jpbi.blogspot.com/2007/05/drawing-polygons-with-xy-chart.html">previous posting</a>). The following image shows some contents of the view:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RkcRaL2wedI/AAAAAAAAAGk/7W0k9NRGb3s/s1600-h/tablecontents2.PNG"><img id="BLOGGER_PHOTO_ID_5064035447558142418" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RkcRaL2wedI/AAAAAAAAAGk/7W0k9NRGb3s/s200/tablecontents2.PNG" border="0" /></a><br /><br />Each row in graph_link table produces 3 rows in graph_link_view with datapoint_id values -1, 0, and 1 (0=start line, 1=finish line, -1=break line). Some nodes such as Iceland does not have border relationships with other country nodes, so it produces 2 rows into view (-1, 0).<br /><br />4. Use the following query behind your report.<br /><br /><span style="font-family:courier new;font-size:85%;">SELECT link.link_id, link.datapoint_id, node.node_id, node.node_name, node.node_order, node.node_count</span><br /><span style="font-family:courier new;font-size:85%;">FROM graph_link_view link</span><br /><span style="font-family:courier new;font-size:85%;">LEFT OUTER JOIN graph_node node ON link.node_id = node.node_id;</span><br /><br />5. Create the measure dimension with the followin three calculations:<br /><br />X = MIN(SIN(360*node.node_order/node.node_count/180*PI()))<br />Y = MIN(COS(360*node.node_order/node.node_count/180*PI()))<br />Label = MIN(node.node_id)<br /><br />6. Create other dimensions as follows:<br /><br />Outer category = link.link_id<br />Middle category = link.datapoint_id<br />Inner category = node.node_name (=serves as a <em>tool tip label</em>)<br /><br />7. Create an XY chart and format it as follows:<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RkcIL72weaI/AAAAAAAAAGM/TnvdFrJij8w/s1600-h/countrylinks1.PNG"><img id="BLOGGER_PHOTO_ID_5064025307140356514" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RkcIL72weaI/AAAAAAAAAGM/TnvdFrJij8w/s320/countrylinks1.PNG" border="0" /></a><br /><br />The XY chart displays countries as data points in circle and border neighboring relationships between them as lines. Iceland and Malta aren't connected (they are islands).<br /><br />A few notes:<br /><br />I created this XY chart with the Voyant tool. It displays the data point label to the right of the data point. Additionally, when you move the mouse cursor over a data point, I specified the chart to display the inner category dimension as a tool tip (ISL data point displays "Iceland").<br /><br />I explored some other OLAP charting tools whether or not it's possible to create the same XY. All of them seem to fall short in displaying data point labels, such as Excel, BusinessObjects Desktop Intelligence and Web Intelligence. Why such a useful feature is not supported by them, I don't know?<br /><br />About the XY chart itself, it's actually somewhat hard to read, because the lines cross so often. Therefore I created another chart (below) in which I used longitudes and latitudes for (x, y) coordinates. Much better!<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RkcIWr2webI/AAAAAAAAAGU/DWhwYmUNSTU/s1600-h/countrylinks2.PNG"><img id="BLOGGER_PHOTO_ID_5064025491823950258" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RkcIWr2webI/AAAAAAAAAGU/DWhwYmUNSTU/s320/countrylinks2.PNG" border="0" /></a><br /><br />And of course, there's no meaning in displaying neighbor country relationships as a graph, because a simple map of Europe will do.<br /><br />Nevertheless, the method above is ok to visualize connections found within customer relationship databases. For example, supposing your CRM database contains information about meetings (when, who, on which project), you can use this data to visualize which people were involved together in certain projects at certain time period. An XY graph visualizes this much better than a table.<br /><br />I will now put an end to this posting, though I have two more issues in my mind:<br /><br /><br /><ul><li>What to do if your graph is very large?</li><li>Hierarchical graphs?</li></ul>Later...<br /><br /><span style="color:#cc0000;">UPDATE (May 14, 2007): I've forgotten two borders, one between Germany and Czech Republic and another between Switzerland and Austria.</span><br /><span style="color:#cc0000;"></span><br /><span style="color:#cc0000;">UPDATE (May 18, 2007): I recently learned that there's software for fully automated graph layout, <a href="http://www.wolfram.com/products/mathematica/newin6/content/FullyAutomatedGraphLayout/">Mathematica 6</a>. If this functionality existed in some OLAP tool, it would be nice.</span>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-72652091720111989012007-05-09T18:12:00.000+03:002007-05-10T20:09:20.466+03:00XY chart and a regular polygon exampleXY chart (scatter chart) is one of my favourites, because it allows you to plot your data more freely. I already used XY in the <a href="http://jpbi.blogspot.com/2007/04/crystal-xcelcius-playful-3d-demo.html">posting about Crystal Xcelsius</a>. In this posting I will document one interesting method available in most standard OLAP tools. See the following video.<br /><br /><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/-wn9Nl_DAL4"><embed src="http://www.youtube.com/v/-wn9Nl_DAL4" type="application/x-shockwave-flash" width="425" height="350"></embed></object><br /><br />In the video, I use the first menu choice/filter to change the number of sides in a regular polygon. The second menu choice/filter allows me to decide the number of nested polygons drawn within each other.<br /><br />I used the Voyant OLAP tool in the video. Can you do it too? Probably you can!<br /><br />In this posting, I assume that your OLAP XY chart is capable of plotting both symbols and lines between symbols. I also assume that if some (x, y) coordinate is missing in between, then the line breaks, such as in the following sample XY chart in Excel.<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RkDXcb2weUI/AAAAAAAAAFc/J4xSiWSfY5w/s1600-h/xydrawing.PNG"><img id="BLOGGER_PHOTO_ID_5062282864678172994" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RkDXcb2weUI/AAAAAAAAAFc/J4xSiWSfY5w/s320/xydrawing.PNG" border="0" /></a><br /><br />Step-by-step:<br /><br />1. Create a database table:<br /><br /><span style="font-family:courier new;"><span style="font-size:85%;">CREATE TABLE integers (i INTEGER, one INTEGER)</span><br /></span><br />2. Populate the <em>integers</em> table with 32 rows: the i column contains numbers between -1 and 30; the one column has NULL for -1 and 1 for other numbers.<br /><br /><span style="font-family:courier new;font-size:85%;">INSERT INTO integers VALUES (-1, NULL);<br />INSERT INTO integers VALUES (0, 1);<br />INSERT INTO integers VALUES (1, 1);<br />INSERT INTO integers VALUES (2, 1);<br />...<br />INSERT INTO integers VALUES (29, 1);<br />INSERT INTO integers VALUES (30, 1);<br /></span><br />3. Create the following query behind your report:<br /><br /><span style="font-family:courier new;"><span style="font-size:85%;">SELECT corner_point.i, corner_point.one, corner_point_count.i, polygon.i, polygon_count.i<br />FROM<br />integers corner_point,<br />integers corner_point_count,<br />integers polygon,<br />integers polygon_count<br />WHERE corner_point.i <= corner_point_count.i</span></span><br /><span style="font-family:courier new;"><span style="font-size:85%;">AND corner_point_count.i BETWEEN 3 AND 30</span></span><br /><span style="font-family:courier new;"><span style="font-size:85%;">AND polygon.i <= polygon_count.i</span></span><br /><span style="font-family:courier new;"><span style="font-size:85%;">AND polygon.i >= 1</span></span><br /><span style="font-family:courier new;"><span style="font-size:85%;">AND polygon_count.i BETWEEN 1 AND 12;</span><br /></span><br />The <em>integers</em> table is used in four different meanings: individual corner point, number of corner points (between 3 and 30), individual polygon, number of nested polygons (between 1 and 12).<br /><br />Note that in the query, each regular polygon with n corners has n+2 corner points (-1, 0, 1, 2, 3, ..., n). One extra corner point is necessary for drawing a closed polygon. The -1 corner point is necessary to break a line before starting to draw an inner polygon (NULL corner point).<br /><br />By the way, the query returns 40404 rows. Can you calculate why?<br /><br />4. Create a report by using the following calculations for coordinates:<br /><br />x = MIN(polygon.i/polygon_count.i*SIN(360*corner_point.i/corner_point_count.i/180*PI()*corner_point.one))<br />y = MIN(polygon.i/polygon_count.i*COS(360*corner_point.i/corner_point_count.i/180*PI()*corner_point.one))<br /><br />Note that both calculations have <em>*corner_point.one</em> multiplication in it to cause the (x, y) coordinate for -1 to change into (NULL, NULL).<br /><br />5. Create other dimensions as follows:<br /><br />Menu 1 = corner_point_count.i<br />Menu 2 = polygon_count.i<br />Outer category = polygon.i<br />Inner category = corner_point.i<br /><br />6. Build the report and change the chart type into XY.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RkHqM72weYI/AAAAAAAAAF8/OzodXImHKD0/s1600-h/xy+first2.PNG"><img id="BLOGGER_PHOTO_ID_5062584964087839106" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RkHqM72weYI/AAAAAAAAAF8/OzodXImHKD0/s320/xy+first2.PNG" border="0" /></a><br /><br />7. Change into manual scale by using limits -1 and +1 for both axes. Hide the legend, axes, and unnecessary gridlines. Change all symbols to have the same color.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RkHij72weXI/AAAAAAAAAF0/2mB6PViDtUg/s1600-h/xy+second.PNG"><img id="BLOGGER_PHOTO_ID_5062576563131808114" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RkHij72weXI/AAAAAAAAAF0/2mB6PViDtUg/s320/xy+second.PNG" border="0" /></a><br /><br />The report is ready!<br /><br />I also tried other OLAP tools. BusinessObjects Desktop Intelligence performed well:<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RkGBEb2weVI/AAAAAAAAAFk/jG4LQFhGi3s/s1600-h/bo+regular+polygon.bmp"><img id="BLOGGER_PHOTO_ID_5062469369338034514" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RkGBEb2weVI/AAAAAAAAAFk/jG4LQFhGi3s/s320/bo+regular+polygon.bmp" border="0" /></a><br /><br />I tried BusinessObjects Web Intelligence / InfoView. Unfortunately the last step failed, because I wasn't able to draw lines between data points.<br /><br />Then I tried Microsoft SQL Server 2000 Analysis Services, but without success. There was a couple of reasons such as: the cube editor only allows equijoins between tables while my query needs <= operators; PivotTable in Excel is not able to utilize XY chart. Has the situation changed in the 2005 Analysis Services? I don't know.<br /><br />That's all for this time. Though this nested polygon example is not useful in business intelligence as such, it documents a method how you can use an XY chart to draw something new. In the following postings, I'll continue this idea and show something that is indeed very useful!Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-20460080980456248132007-05-02T22:06:00.000+03:002007-05-02T23:52:54.881+03:00Self-drilling and problem solving expert system--easy job!I continue the topic of <em>self-drilling</em> introduced in my <a href="http://jpbi.blogspot.com/2007/04/self-drilling-and-finite-state-machines.html">previous posting</a>. Is the following flowchart familiar to you?<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RjjxEL2weSI/AAAAAAAAAFM/EeYNtwkhYW0/s1600-h/problemsolving_flowchart.PNG"><img id="BLOGGER_PHOTO_ID_5060059235554982178" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RjjxEL2weSI/AAAAAAAAAFM/EeYNtwkhYW0/s400/problemsolving_flowchart.PNG" border="0" /></a><br /><br />It's actually a <a href="http://en.wikipedia.org/wiki/Finite_state_machine">finite state machine</a> that represents a simple <a href="http://en.wikipedia.org/wiki/Expert_system">expert system</a> driven by questions and aswers. In the database you can model the flowchart by using two tables: exp_state and exp_transition.<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RjjxUb2weTI/AAAAAAAAAFU/3ZsY1JTZp-s/s1600-h/problemsolving_tables.PNG"><img id="BLOGGER_PHOTO_ID_5060059514727856434" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RjjxUb2weTI/AAAAAAAAAFU/3ZsY1JTZp-s/s320/problemsolving_tables.PNG" border="0" /></a><br /><br />By using self-drilling reports and Voyant, I created an expert system OLAP model. See the following video:<br /><br /><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/nb9WvODOWyo"><embed src="http://www.youtube.com/v/nb9WvODOWyo" type="application/x-shockwave-flash" width="425" height="350"></embed></object><br /><br />It was quite easy. My solution is general. If you'd put thousands of questions and answers in the exp_state and exp_transition tables, it would work as perfectly as in the video. (There's no limit on the number of optional answers per question or the length of questions and answers.)<br /><br />I call this a <em>simple expert system</em>, because a true expert system should allow <a href="http://en.wikipedia.org/wiki/Fuzzy_logic">fuzzy logic</a> (probabilities between 0 and 1), which this system does not do. Also a true expert system should be able to calculate costs (or something else) based on the choices made by the user.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-6613181569499309972007-04-29T11:04:00.000+03:002007-04-29T16:45:46.678+03:00Self-drilling and finite state machines, does your OLAP tool support them?Self-drilling, drilling down to itself, is an interesting concept in business intelligence reports. Among many things, it allows your OLAP tool to turn into a simulation tool much like Crystal Xcelsius. See the following video:<br /><br /><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/xsns3QKyt7w"><embed src="http://www.youtube.com/v/xsns3QKyt7w" type="application/x-shockwave-flash" width="425" height="350"></embed></object><br /><br />In the video, I'm working with the Voyant OLAP tool and a line chart. I click several times a meter control = the simulated change percent for the last three months of the year (between -100% and +100%). When clicking, the forecast (green line) moves accordingly.<br /><br />How did I do it? Self-drilling is the key.<br /><br />Step-by-step (please try it with your OLAP tool):<br /><br />1. Create a new table: <span style="font-family:courier new;font-size:85%;">CREATE TABLE sim_change_percent (change_percent INTEGER)</span><br /><br />2. Populate the table with integers between -100 and 100 = 201 rows.<br /><br />3. Define a query for your report: <span style="font-family:courier new;font-size:85%;">SELECT choice.change_percent "Choice", display.change_percent "Display" FROM sim_change_percent choice, sim_change_percent display WHERE choice.change_percent <> display.change_percent</span><br /><br />Please note that the two instances of the sim_change_percent table "choice" and "display" are joined by <> operator. The query produces 201 x 200 rows = 40 200 rows.<br /><br />4. Create a stacked column graph where you use:<br /><ul><li>Legend of the chart = measure dimension = COUNT(*)</li><li>Category axis = <em>Display</em> column of the query</li><li>Menu choice/filter = <em>Choice</em> column of the query</li></ul>The report looks like this:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RjOAtL2weKI/AAAAAAAAAEM/gyvBg3rMzN0/s1600-h/step1.PNG"><img id="BLOGGER_PHOTO_ID_5058528320232126626" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RjOAtL2weKI/AAAAAAAAAEM/gyvBg3rMzN0/s400/step1.PNG" border="0" /></a><br /><br />Note that every column height is one (that's because of <span style="font-size:85%;">COUNT(*)</span>). Also note that when you have chosen a zero percent from the menu choice/filter above the report, the corresponding column is missing from the chart (that's because of the <> operator).<br /><br />5. Add a formula/calculation to the measure dimension: 1-[count]<br /><p>This fills the gap in the stacked column chart. Because there are two members in the legend, you can use different colors for them to show which one is selected. The next image shows colors changed and boundary lines removed.</p><p><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RjOCEr2weMI/AAAAAAAAAEc/_BdhbY84TOc/s1600-h/step2.PNG"><img id="BLOGGER_PHOTO_ID_5058529823470680258" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RjOCEr2weMI/AAAAAAAAAEc/_BdhbY84TOc/s400/step2.PNG" border="0" /></a></p><p>6. Now to self-drilling: Whichever way it's done in your OLAP tool, specify that clicking a column drills down to the menu choice/filter of the same report.</p><p>In Voyant, you do this by specifying interaction options.</p><p><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RjOOhL2weNI/AAAAAAAAAEk/SrG6BHPBoNQ/s1600-h/step3.PNG"><img id="BLOGGER_PHOTO_ID_5058543507236485330" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RjOOhL2weNI/AAAAAAAAAEk/SrG6BHPBoNQ/s320/step3.PNG" border="0" /></a></p><p>7. Change the visual display by removing the legend and both axes. Specify the gap width between columns to 0%. Add explanatory texts (titles, subtitles, footers, etc) around the meter control to make it more comprehensible. Also add a tool tip (called a <em>chart tip</em> in Voyant) to display a change percentage near the mouse cursor.</p><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RjOTrr2weOI/AAAAAAAAAEs/WmY_4d8SMQ8/s1600-h/step4.PNG"><img id="BLOGGER_PHOTO_ID_5058549185183250658" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RjOTrr2weOI/AAAAAAAAAEs/WmY_4d8SMQ8/s400/step4.PNG" border="0" /></a><br /><p>Thanks to self-drilling, the stacked column chart now looks and performs like a meter control.</p><p>Let's continue to the line chart with Actual, Forecast, and Budget:<br /><br />8. Create a new table: <span style="font-family:courier new;font-size:85%;">CREATE TABLE sim_money (month INTEGER, actual DOUBLE PRECISION, forecast DOUBLE PRECISION, budget DOUBLE PRECISION)</span><br /><br />9. Populate the sim_money table with the following numbers (12 rows):<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RjGveb2weII/AAAAAAAAAD8/SH9h1brI1PA/s1600-h/sim_money.PNG"><img id="BLOGGER_PHOTO_ID_5058016793922140290" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RjGveb2weII/AAAAAAAAAD8/SH9h1brI1PA/s320/sim_money.PNG" border="0" /></a><br /><br />10. Define a query for your line chart report: <span style="font-family:courier new;font-size:85%;">SELECT month, change_percent, actual, forecast, budget FROM sim_money, sim_change_percent</span><br /><br />The idea of the query above is to not join at all the sim_money and sim_change_percent tables. This cartesian product gives us 12 x 201 rows = 2412 rows.<br /><br />11. Specify a line chart by:</p><ul><li>Legend of the chart = measure dimension = SUM(actual), SUM(forecast*(100+change_percent)/100), SUM(budget)</li><li>Category axis = <em>month</em> column of the query</li><li>Menu choice/filter = <em>change_percent</em> column of the query</li></ul>After enhancing the visual display the line chart might look like this:<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RjRjb72weQI/AAAAAAAAAE8/8y1e3F81YCc/s1600-h/step5.PNG"><img id="BLOGGER_PHOTO_ID_5058777613018888450" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RjRjb72weQI/AAAAAAAAAE8/8y1e3F81YCc/s320/step5.PNG" border="0" /></a><br /><br />12. Position the two charts as in the following image, hide the menus, and make them synchronized.<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RjRjxb2weRI/AAAAAAAAAFE/j4qw2BX3I2s/s1600-h/step6.PNG"><img id="BLOGGER_PHOTO_ID_5058777982386075922" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RjRjxb2weRI/AAAAAAAAAFE/j4qw2BX3I2s/s400/step6.PNG" border="0" /></a><br /><br />Synchronization means that both reports share a global menu choice/filter (change percentage with 201 choices). Neither of the menus/filters need to be visible, because the meter control enables you choose by clicking.<br /><br />The simulation model is ready. Please leave a comment, if your OLAP tool is able to do it.<br /><br />Additional notes about self-drilling:<br /><ul><li>When using self-drilling, you always need a query with a cartesian product.</li><li>Self-drilling makes simulation models available in OLAP tools. If a report uses two simulation variables, you need two cartesian products in the query, etc.</li><li>In a larger perspective, self-drilling allows you to create <a href="http://en.wikipedia.org/wiki/Finite_state_machine">finite state machine</a> models. Transitions between states are performed by mouse clicks.</li><li>A finite-state-machine way of thinking allows you to create versatile navigational systems. This includes a simple <a href="http://en.wikipedia.org/wiki/Expert_system">expert system</a>, where a series of questions and answers help a user solve a specific problem. For example, rows of a sheet contain different choices (answers) that take you to the next state (the next question and the next set of answers).</li><li>Because of one or more cartesian products that exist behind self-drilling systems, the number of rows included in calculation grow easily very large. Thus, OLAP tools that store all the data in RAM become more easily out of action. Self-drilling works best with online data access from a database.</li></ul><p>PS. <em>Self-drilling</em> as an idea and as a term are my independent inventions, as well as the ways to utilize the concept. Please leave a comment, if you know somebody uses a different terminology about the matter. </p>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-82021508787333131262007-04-22T19:34:00.000+02:002007-04-24T20:18:18.004+02:00A great book about dashboard design by Stephen Few<a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RiuycES3WKI/AAAAAAAAAD0/ty3jNb04uGk/s1600-h/dashboard_bookcover.PNG"><img id="BLOGGER_PHOTO_ID_5056331201912461474" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RiuycES3WKI/AAAAAAAAAD0/ty3jNb04uGk/s200/dashboard_bookcover.PNG" border="0" /></a>I read a book from Stephen Few: Information Dashboard Design.<br /><br />It's a great book. The idea of the book is to explain what makes business dashboards good and informative, and which mistakes we nevertheless often do and create dashboards that don't communicate.<br /><br />Those of you not familiar with business dashboards I give Stephen Few's definition: <em>A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glange</em>.<br /><br />Why this book made me read it? Data visualization has interested me a long time. After all I was in the business for 16 years programming various graphs and making it possible to combine them in a meaningful fashion.<br /><br />I liked intense discussions with my colleaques criticizing our works, which then of course have to be re-evaluated and reprogrammed, over and over again. The process inevitably increased our knowledge of data visualization. As I have now changed my job, Stephen Few's book is a mirror for me to check where my knowledge is.<br /><br />Notes about the book:<br /><br />The author classifies things in a nice way. For example, the dashboard categorization for various roles (strategic, analytical, and operational) made me nod my head in agreement. Yes, for sure these roles have different dashboard design challenges.<br /><br />In a strategic dashboard, the big managerial picture of your business is important, budgets, actuals, and forecasts, often long terms of time (years, quarters, months), while drilling down into details is not the main issue. In an analytical dashboard, you should see everything that helps you make decisions to follow the current strategic plan, that is, by making many kind of comparisons, drilling down into details, and tracking changes in shorter terms of time (months, weeks, days). Operational dashboards are for monitoring tasks that might need action soon or right now; your interest is in intra-day periods (hours, minutes).<br /><br />Classifications continue: thirteen common mistakes in dashboards (such as exceeding the boundaries of a single screen), eleven preattentive attributes of visual perception (color, position, form, motion, and their sub-categories--what works and why), the concept of data-ink ratio (eliminate unnecessary non-data pixels, enhance data pixels, highlight most important data pixels), nine graphs that communicate best (such as bar and line charts) and graphs that should be avoided (all area graphs such as pie chart, because our ability to compare 2D areas is not precise). Very thorough.<br /><br />But the book is not only theory. It contains plenty of examples about good and poor dashboard design. I found amusing the idea of analyzing sample dashboards found from the web sites of software vendors (and a few other sources).<br /><br />As I think this approach and myself 15 years ago starting to create samples for our OLAP charting software, what is my reaction? Yes, my first examples were poor also. The following dashboard was my invention in 1997 (the buttons on the map allow you to see weather stats of the place):<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RipoM0S3WII/AAAAAAAAADk/AbjazwoVIOY/s1600-h/weatherdigrams.PNG"><img id="BLOGGER_PHOTO_ID_5055968101082290306" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RipoM0S3WII/AAAAAAAAADk/AbjazwoVIOY/s320/weatherdigrams.PNG" border="0" /></a><br /><br />Though this is not business data, mistakes are obvious: The map takes too much space compared to data (the map here is not data but a decorative image, a simple menu or list with place names would have done better), the graphs are in wrong location (top-left is the most important region for your data, not bottom-right), too many explanatory texts (better put them into another screen and allow a user to go there by using a link), too colorful (color should be used sparingly, only to highlight what is most important or something that needs an action).<br /><br />It took twelve years (!) for me to learn to design dashboards that really communicate. The following sales dashboard was my invention in 2003 (still available online <a href="http://demo.artemis.fi/xenon_sales">here</a>):<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Rir47ES3WJI/AAAAAAAAADs/Fwb_vgpV4hk/s1600-h/xenon_sales.PNG"><img id="BLOGGER_PHOTO_ID_5056127225325639826" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/Rir47ES3WJI/AAAAAAAAADs/Fwb_vgpV4hk/s320/xenon_sales.PNG" border="0" /></a><br /><br />The idea is to start from top and see details appear in lower sections. Multiple comparisons are available (countries, customer segments, products, top 10 cities and customers, etc).<br /><br />Even though I primarily designed this dashboard as a demonstration of our ideas, the model was later adopted by <a href="http://procountor.com">ProCountor</a>, a Finnish bookkeeping agency on the web. Their customers were happy about the possibility to watch their sales accounts in a such rich way, not available in any other bookkeeping solution.<br /><br />Back to the book. After showing us several design mistakes, I liked the author's bold goal to design his best dashboard models, such as the sales dashboard. Everything is put out in a communicative way, what choices he went through, etc. The resulting dashboard is versatile in ways that made me ask details from a colleague who knows sales better than I. Yes, it works.<br /><br />Though a great book, I found one important area missing:<br /><br />The scope of the book is non-interactive dashboards, not interactive ones. Interaction methods are not classified (the only method even mentioned is <em>drilling-down</em>).<br /><br />As we created the Voyant OLAP tool, we included four interaction methods in it. For example, <em>simultaneousness</em> is an important principle in interactive dashboards. That is, in the online sales dashboard demo above (the link again <a href="http://demo.artemis.fi/xenon_sales">here</a>), if you choose sales "by product" in the bottom-left report, the bottom-right report is simultaneously displayed by products too.<br /><br />(I hope I have time to blog about interaction methods later.)<br /><br />By the way, Stephen Few writes a blog too. A link is provided <a href="http://perceptualedge.com/blog/">here</a> and permanently under my link list.<br /><br /><span style="color:#cc0000;">UPDATE (Tuesday, April 24, 2007):</span><br /><span style="color:#cc0000;"></span><br /><span style="color:#cc0000;">After exchanging a couple of emails with the author, it appears to me that the scope of the book is on purpose limited to "non-interactive dashboards" (my term). In fact, Stephen Few kindly guided me to his article "Dashboard Confusion Revisited", available as pdf </span><a href="http://www.perceptualedge.com/articles/03-22-07.pdf"><span style="color:#cc0000;">here</span></a><span style="color:#cc0000;">.<br /><br />In the article, he sees that "faceted analytical display" (his new term) is a different concept than "dashboard". While in a <em>dashboard</em> you should monitor the information at a glance, a <em>faceted analytical display</em> provides interaction mechanisms to reveal more perspectives (more facets) to the information.</span><br /><br /><span style="color:#cc0000;">A definition proposed by the author: <em>A "faceted analytical display" is a set of interactive charts (primarily graphs and tables) that simultaneously reside on a single screen, each of which presents a somewhat different view of a common dataset, and is used to analyze that information.</em></span><br /><br /><span style="color:#cc0000;">We need these new terms and concepts to clearly talk about the issues in this growing area of knowledge!</span>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com4tag:blogger.com,1999:blog-5996843362424013237.post-31176494945149670482007-04-16T19:49:00.000+02:002007-04-16T22:05:26.458+02:00Crystal Xcelsius--a "what-if" analysis tool and a playful 3D demo<a href="http://www.xcelsius.com">Crystal Xcelsius</a> has been the first BI tool for me to learn in my new job. It's a great piece of software to create interactive "what-if" dashboards. It even lets you misuse itself in a playful fashion. See the following video (sorry about bad quality).<br /><object height="350" width="425"><param name="movie" value="http://www.youtube.com/v/GoTjVaaPnY8"><br /><embed src="http://www.youtube.com/v/GoTjVaaPnY8" type="application/x-shockwave-flash" width="425" height="350"></embed></object><br />In the video, I first open a PowerPoint document that contains a Flash model (blue data points and play button). After starting a slide show, the play button allows me to spin data points around a 3D ball (rotating text "<span style="font-size:85%;">ESPOO</span>" is the name of my home town).<br /><br />How did I do it? See the following flowchart about the Xcelsius development process (the Flash model of course works in a browser too).<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RiPVsy4TbrI/AAAAAAAAADc/5dz5kIE68uA/s1600-h/xcelcius+process.PNG"><img id="BLOGGER_PHOTO_ID_5054118172388716210" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RiPVsy4TbrI/AAAAAAAAADc/5dz5kIE68uA/s400/xcelcius+process.PNG" border="0" /></a><br /><br />You always build an Xcelsius model based on an Excel spreadsheet. If I simplify it a little, some Excel cells are input cells (in Xcelsius, you make play buttons, sliders, check boxes, radio buttons, combo boxes, or text boxes to change their values) and some are output cells (in Xcelsius, you create charts using those).<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RiOtti4TbqI/AAAAAAAAADU/4fj4EfaXNpY/s1600-h/xcelcius+slider.PNG"><img id="BLOGGER_PHOTO_ID_5054074204808507042" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RiOtti4TbqI/AAAAAAAAADU/4fj4EfaXNpY/s320/xcelcius+slider.PNG" border="0" /></a><br /><br />The method of using Excel and Xcelsius together means that you end up having specifications in two places. The larger the Xcelsius model grows, the more dependencies you have between the two. Adding a third component--online database access to Excel--is possible as well. As I so far don't know the best practice to build a large Xcelsius model, I consider you rather build several independent small Xcelsius dashboards instead of a combined one.<br /><br />To finally show how I created the 3D demo, below you can see the Excel speadsheet. Original data points are in B13:C62 (for example lines 13 to 25 draw the letter "E"). Cells I13:J62 are output cells for the XY chart. B5 is the input cell for the Play button.<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/RiOPsS4TbnI/AAAAAAAAAC8/HoGN9fpQh2c/s1600-h/espooball.PNG"><img id="BLOGGER_PHOTO_ID_5054041197984837234" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/RiOPsS4TbnI/AAAAAAAAAC8/HoGN9fpQh2c/s400/espooball.PNG" border="0" /></a><br /><br />The following image shows Crystal Xcelsius in the designer mode. The Play button is commanded to increase its value from 0 to 359 degrees continuously (auto replay selected).<br /><br /><a href="http://1.bp.blogspot.com/_GY_NfovZtsk/RiOP3C4TboI/AAAAAAAAADE/ZcscVOECKzw/s1600-h/espooball-xcelcius2.PNG"><img id="BLOGGER_PHOTO_ID_5054041382668430978" style="CURSOR: hand" alt="" src="http://1.bp.blogspot.com/_GY_NfovZtsk/RiOP3C4TboI/AAAAAAAAADE/ZcscVOECKzw/s400/espooball-xcelcius2.PNG" border="0" /></a>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com4tag:blogger.com,1999:blog-5996843362424013237.post-31031489098148890962007-04-10T21:34:00.000+02:002007-04-16T20:53:47.403+02:00Monitoring changes over time--my best practicesMonitoring changes over time is essential in order to make better business decisions.<br /><br />Let's start with the following two graphs. (All report objects are created with the <a href="http://www.artemis.fi/voyant_eng">Voyant</a> tool. Other charting tools allow you to do corresponding graphs.)<br /><br /><kuva><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/Rhvouy4TbhI/AAAAAAAAACM/wwWT8MTesx8/s1600-h/dateaxis2007_step1.PNG"></a><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhvpci4TbiI/AAAAAAAAACU/sl4d5KbB70Q/s1600-h/dateaxis2007_step1.PNG"><img id="BLOGGER_PHOTO_ID_5051888083634712098" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhvpci4TbiI/AAAAAAAAACU/sl4d5KbB70Q/s320/dateaxis2007_step1.PNG" border="0" /></a><br /><br /><em><span style="font-size:85%;">NOTE: Series A, B, C (etc) are general placeholders for your business data--they represent any phenomena compared over time such as actual vs. budget vs. forecast, sales vs. costs, revenues by operating segment or geographic region, capacity vs. actual and planned production, etc.</span></em><br /><br />These simple graphs display twelve months (52 weeks) in the year chosen from the menu above the chart. Good point: the charts are simple. Bad point: you cannot easily track changes that extend to two years or more.<br /><br />To enhance the charts, let's move the year dimension into the category axis.<br /><br /><kuva><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RhvrPy4TbjI/AAAAAAAAACc/szMTc6dhNE0/s1600-h/dateaxisAllYears_step2.PNG"><img id="BLOGGER_PHOTO_ID_5051890063614635570" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RhvrPy4TbjI/AAAAAAAAACc/szMTc6dhNE0/s320/dateaxisAllYears_step2.PNG" border="0" /></a><br /><br />Now you see changes over longer periods of time (good), but the graph contains too many years--11 years!--to help you easily observe changes that are happening right now (bad).<br /><br />To provide more space for the category axis, you need to limit the amount of data. It's a good idea to use selection criteria such as <em>date >= 1st January 2006</em>. However, using a fixed date in criteria is not good--it's better to create dynamic date criteria such as <em>date >= 1st January previous year</em> or <em>date >= 1st day of current month previous year</em>. Additionally, you often want to limit dates in the future using <em>date <= last day of the next year</em> or something similar. Let's see how the graphs are enhanced.<br /><br /><a href="http://2.bp.blogspot.com/_GY_NfovZtsk/Rhvv-S4TbkI/AAAAAAAAACk/aMjm2SAJwAY/s1600-h/dateaxisLimited_step3.PNG"><img id="BLOGGER_PHOTO_ID_5051895260525063746" style="CURSOR: hand" alt="" src="http://2.bp.blogspot.com/_GY_NfovZtsk/Rhvv-S4TbkI/AAAAAAAAACk/aMjm2SAJwAY/s320/dateaxisLimited_step3.PNG" border="0" /></a><br /><br />As we now live April 2007 (current date printed in the bottom-right corner of the graph), it's easy to see actual and forecasted changes at a glance.<br /><br />Of course it's up to your business what is the relevant period and frequency of data: previous-current-next year, last 12 months, or something else? Monthly, weekly, or daily data? Generally it's ok to watch about 50-100 data points spread on the date axis as soon as the software presents them clearly. This equals to about 8 years of monthly data, 1.5 years of weekly data, or 3 months of daily data, such as in the following image.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhvz8i4TblI/AAAAAAAAACs/PVosB1pMP84/s1600-h/dateaxisDaily_step4b.PNG"><img id="BLOGGER_PHOTO_ID_5051899628506803794" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhvz8i4TblI/AAAAAAAAACs/PVosB1pMP84/s320/dateaxisDaily_step4b.PNG" border="0" /></a><br /><br />Nevertheless more than hundred data points are well possible, if the software provides a date axis that scales dynamically. See the following Voyant charts with 261 weekly values (5 years, each data point in Thursday) and 396 daily values (last 12+1 months)--a lot of data but very clear. The software allows you to check the exact date by moving a mouse cursor onto a line.<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhv6Wi4TbmI/AAAAAAAAAC0/_-bDJRkFpW4/s1600-h/dateaxisLongPeriod_step6.PNG"><img id="BLOGGER_PHOTO_ID_5051906672253169250" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/Rhv6Wi4TbmI/AAAAAAAAAC0/_-bDJRkFpW4/s320/dateaxisLongPeriod_step6.PNG" border="0" /></a><br /><br />Technically, it depends on the database how to create dynamic date criteria and other issues discussed above. The following examples show some methods. Feel free to use and enhance them in your own projects.<br /><br />In Oracle:<br /><ul><li>Dates after first day of this year:<span style="font-family:courier new;font-size:85%;"> WHERE date >= TRUNC(SYSDATE,'YYYY')</span></li><li>Dates after first day of previous month:<span style="font-family:courier new;font-size:85%;"> WHERE date >= TRUNC(ADD_MONTHS(SYSDATE,-1),'MM')</span></li><li>Dates between previous year and next year (three years):<span style="font-family:courier new;font-size:85%;"> WHERE date BETWEEN TRUNC(SYSDATE-365,'YYYY') AND TRUNC(SYSDATE+2*365,'YYYY')-1</span></li><li>Dates during the last 12 months (including this month previous year):<span style="font-family:courier new;font-size:85%;"> WHERE date BETWEEN TRUNC(ADD_MONTHS(SYSDATE,-12),'MM') AND TRUNC(ADD_MONTHS(SYSDATE,1),'MM')-1</span></li></ul><p>In ODBC (SQL Server, MySQL, and many others):</p><ul><li>Dates after first day of this year:<span style="font-family:courier new;font-size:85%;"> WHERE date >= {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFYEAR({fn CURDATE()})}+1, {fn CURDATE()})}</span></li><li>Dates after first day of previous month:<span style="font-family:courier new;font-size:85%;"> WHERE date >= {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFMONTH({fn TIMESTAMPADD(SQL_TSI_MONTH,-1,{fn CURDATE()})})}+1, {fn TIMESTAMPADD(SQL_TSI_MONTH,-1,{fn CURDATE()})})}</span></li><li>Dates between previous year and next year (three years):<span style="font-family:courier new;font-size:85%;"> WHERE date BETWEEN {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFYEAR({fn TIMESTAMPADD(SQL_TSI_DAY,-365,{fn CURDATE()})})}+1, {fn TIMESTAMPADD(SQL_TSI_DAY,-365,{fn CURDATE()})})} AND {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFYEAR({fn TIMESTAMPADD(SQL_TSI_DAY,2*365,{fn CURDATE()})})}, {fn TIMESTAMPADD(SQL_TSI_DAY,2*365,{fn CURDATE()})})}</span></li><li>Dates during the last 12 months (including this month previous year):<span style="font-family:courier new;font-size:85%;"> WHERE date BETWEEN {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFMONTH({fn TIMESTAMPADD(SQL_TSI_MONTH,-12,{fn CURDATE()})})}+1, {fn TIMESTAMPADD(SQL_TSI_MONTH,-12,{fn CURDATE()})})} AND {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn DAYOFMONTH({fn TIMESTAMPADD(SQL_TSI_MONTH,1,{fn CURDATE()})})}+1, {fn TIMESTAMPADD(SQL_TSI_MONTH,1,{fn CURDATE()})})}</span></li></ul>You also need expressions to create years, months, weeks, etc.<br /><br />In Oracle: <ul><li>Year:<span style="font-family:courier new;font-size:85%;"> TO_NUMBER(TO_CHAR(date,'YYYY'))</span></li><li>Month:<span style="font-family:courier new;font-size:85%;"> TO_NUMBER(TO_CHAR(date,'MM'))</span></li><li>Year (when used with ISO standard week):<span style="font-family:courier new;font-size:85%;"> TO_NUMBER(TO_CHAR(date,'IYYY'))</span></li><li>Week, ISO standard starting on Monday:<span style="font-family:courier new;font-size:85%;"> TO_NUMBER(TO_CHAR(date,'IW'))</span></li><li>Day of month:<span style="font-family:courier new;font-size:85%;"> TO_NUMBER(TO_CHAR(date,'DD'))</span></li><li>Date as such:<span style="font-family:courier new;font-size:85%;"> date</span></li><li>Week (as a date value of Thursday = the middle day of week):<span style="font-family:courier new;font-size:85%;"> date-TO_NUMBER(TO_CHAR(date,'D'))+4</span></li></ul><p>In ODBC (SQL Server, MySQL, and many others):</p><ul><li>Year:<span style="font-family:courier new;font-size:85%;"> {fn YEAR(date)}</span></li><li>Month:<span style="font-family:courier new;font-size:85%;"> {fn MONTH(date)}</span></li><li>Year (when used with ISO standard week):<span style="font-family:courier new;font-size:85%;"> {fn YEAR({fn TIMESTAMPADD(SQL_TSI_DAY,-{fn MOD({fn DAYOFWEEK(date)}+5,7)}+3,date)})}</span></li><li>Week, ISO standard starting on Monday:<span style="font-family:courier new;font-size:85%;"> {fn FLOOR(1+({fn DAYOFYEAR({fn TIMESTAMPADD(SQL_TSI_DAY,-{fn MOD({fn DAYOFWEEK(date)}+5,7)}+3,date)})}-1)/7)}</span></li><li>Day of month:<span style="font-family:courier new;font-size:85%;"> {fn DAYOFMONTH(date)}</span></li><li>Date as such:<span style="font-family:courier new;font-size:85%;"> date</span></li><li>Week (as a date value of Thursday = the middle day of week):<span style="font-family:courier new;font-size:85%;"> {fn TIMESTAMPADD(SQL_TSI_DAY,-{fn MOD({fn DAYOFWEEK(date)}+5,7)}+3,date)}</span></li></ul><p>Happy charting :)</p>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com2tag:blogger.com,1999:blog-5996843362424013237.post-69873895708518018032007-04-02T19:07:00.000+02:002007-04-04T14:03:44.760+02:00Red sunsets and air quality--what chart to use?Spring is two weeks early in Southern Finland. Ice and snow has melted, brooks and rivers are full of water, birds migrate north over the Gulf of Finland, and sunsets are beautiful reddish.<br /><br /><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RhEjnJX8q4I/AAAAAAAAAB0/pHKJUgtMv8I/s1600-h/Copy+of+sunset.jpg"><img id="BLOGGER_PHOTO_ID_5048855812697271170" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RhEjnJX8q4I/AAAAAAAAAB0/pHKJUgtMv8I/s200/Copy+of+sunset.jpg" border="0" /></a>Red sunsets? They might please the eye, but also make my eyes itch when riding my bike. The red color is in fact caused by numerous air pollutants: small particles released from snow and spread all over by traffic, also from burning branches and trees. I'm waiting a heavy rain to wash the dust away.<br /><br />(Aside: The Japanese flag represents the rising red sun, probably designed thousand years ago. Did the designer live in a village where people burned a lot of tree in wintertime? Did he or she understand our modern age will see in the flag a proof of air pollution?)<br /><br />I'm slowly getting towards my point, data visualization, one of my key interests :)<br /><br /><div><div></div><div>Since the air quality is a worry to people nowadays, the largest newspaper in Finland, Helsingin Sanomat, published the following chart a few days ago.<br /></div><div><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RhECh5X8q2I/AAAAAAAAABk/eLafU-tuEBs/s1600-h/airquality_big.PNG"><img id="BLOGGER_PHOTO_ID_5048819438619241314" style="FLOAT: left; MARGIN: 0px 10px 10px 0px; CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RhECh5X8q2I/AAAAAAAAABk/eLafU-tuEBs/s200/airquality_big.PNG" border="0" /></a><em><strong>Air quality in three different places during recent five days</strong></em><br /></div><div>Is the data output well designed? Not at all.</div><div></div><div>When air quality is good, bars are short; when bad, long. Yep. I'm able to get the information but still, the BAR CHART makes me ask in what sense bad air should be "longer" or "more" than good air?<br /><br />Another thing, there are colored bands in the background to tell the air quality in words: <em>good</em>, <em>satisfactory</em>, <em>so-so</em>, <em>bad</em>, <em>very bad</em>. The bands for <em>satisfactory</em> and <em>so-so</em> are narrower than <em>good</em> and <em>bad</em>, while <em>very bad</em> is a much wider band than others. Why? Does that carry some message I should understand?<br /><br />A probable explanation: The graphic designer gets exact figures of air pollutant concentrations every day. Color bands are concentration ranges defined by health care authorities. When drawing the chart, the designer hides exact numbers and adds color bands to correspond the <em>good</em>, <em>satisfactory</em>, <em>so-so</em>, etc air quality ranges.<br /><br />But we humans don't think like this!</div><div></div><div>To us the air quality is either good or bad (or something between the opposites) without having to think which extreme is "more" and which is "less". The scale is irrelevant as soon as you can understand changes over time.<br /></div><div>Below is my proposition how the newspaper should do it (the data is partially my imagination such as forecasted rains starting tomorrow).</div><div></div><div></div><a href="http://4.bp.blogspot.com/_GY_NfovZtsk/RhEuiJX8q6I/AAAAAAAAACE/qt52Q7J5Opw/s1600-h/airqualitybetter.PNG"><img id="BLOGGER_PHOTO_ID_5048867821425830818" style="CURSOR: hand" alt="" src="http://4.bp.blogspot.com/_GY_NfovZtsk/RhEuiJX8q6I/AAAAAAAAACE/qt52Q7J5Opw/s320/airqualitybetter.PNG" border="0" /></a><br /><div></div><div>In this presentation of two charts, air quality is <em>no more a separate phenomenon</em> but comparable to temperature and precipitation (which indeed influence the air quality). A <em>line chart</em> displays changes in air quality clearly. A <em>longer period</em> helps readers put the current air quality in context.</div><div>Much better than the original, don't you think?</div><div></div><div>How about you? Do you also share the same passion to enhance charts and tables printed in daily papers? :)</div><pic></div>Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com1tag:blogger.com,1999:blog-5996843362424013237.post-85778751303172503192007-03-29T21:36:00.000+02:002007-04-02T14:08:26.847+02:00My DW success storyData warehousing does have an influence on the society.<br /><br />This one I noted in Finland after having created a DW for the Central Union of Agricultural Producers and Forest Owners (in Finnish MTK). See the chart below (FYI, texts are in Finnish).<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RgfyfpO37aI/AAAAAAAAAAU/42iZLPBO_Tc/s1600-h/puukauppa.JPG"><img id="BLOGGER_PHOTO_ID_5046268532950035874" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RgfyfpO37aI/AAAAAAAAAAU/42iZLPBO_Tc/s400/puukauppa.JPG" border="0" /></a><br /><br />In 2005, timber price (fir, pine) was less than 48 €/m3. After reports and charts (such as the one above) were published on the web in the end of 2005, log prices have risen more than 20% soon breaking up 60 €/m3.<br /><br />What happened? Before the system was published 300,000 (three hundred thousand) Finnish forest owners were in bad market situation against three giant pulp mill companies that buy most of timber. By voluntarily sharing price information the forest owners as well as their local associations became more aware about the market. (There was also other factors increasing log price such as a change in forest taxation, but that's another story.)<br /><br />Technically, the data is extracted from the databases of 150 timber associations around the country (each association decides how often). If a forest owner uses the local association to sell timber (40% of cases), the data is later on loaded into a central database (we use our own ETL tool JobRunner for that). Finally, the <a href="http://www.artemis.fi/voyant_eng">Voyant</a> OLAP tool made it possible to publish the information clearly on <a href="http://www.mtk.fi">MTK's website</a>.<br /><br />By the way, the three pulp mill companies are now in court accused of controlling log prices against antitrust laws (more information <a href="http://www.kilpailuvirasto.fi/cgi-bin/english.cgi?luku=news-archive&sivu=news/n-2006-12-21">here</a>). Will there emerge a new timber purchasing cartel in Finland some day? Because of the DW, I doubt.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0tag:blogger.com,1999:blog-5996843362424013237.post-31130044029345348372007-03-28T17:20:00.000+02:002007-04-04T14:02:29.139+02:00Hello worldWhile I soon start to work in <a href="http://www.affecto.com/page.php?page_id=38">Affecto</a>, one of the largest business intelligence companies in Northern Europe, it's a good moment to start blogging about BI.<br /><br />About my background, I have 16 years of history in business intelligence. All these years I've been in the developer side, creating an OLAP tool <a href="http://www.artemis.fi/voyant_eng">Voyant</a>. I'm proud to tell that the software has been appreciated by customers for its ease of use and productivity, nevertheless it has not gained a big success.<br /><br />Data visualization is one of my key interests, so my goal is to include an image or two in every posting. The next one is a dashboard I once made to express Voyant's elegant customizable web interface (click <a href="http://demo.artemis.fi/world_data/world_data2.html">here</a> for an online web demo).<br /><br /><a href="http://3.bp.blogspot.com/_GY_NfovZtsk/RgqSApX8qwI/AAAAAAAAAA0/BBF9CWcq4D0/s1600-h/scatterplot.PNG"><img id="BLOGGER_PHOTO_ID_5047006872226016002" style="CURSOR: hand" alt="" src="http://3.bp.blogspot.com/_GY_NfovZtsk/RgqSApX8qwI/AAAAAAAAAA0/BBF9CWcq4D0/s400/scatterplot.PNG" border="0" /></a><br /><br />For those of you being technically interested, Voyant relies on lightweight Java applets in the browser environment (three applets in the example above). The Server version delivers data between database and applets. The Designer and Analyst desktop versions are for specifying report objects and their interaction.<br /><br />Outside of data visualization (and Voyant), I'll plan to write about good and bad sides of those OLAP/BI tools I'll be working with as a consultant during the following months.<br /><br />I hope my BI background will help me see and write about business intelligence in a fresh way.Janne Pyykköhttp://www.blogger.com/profile/09970278371612488984noreply@blogger.com0