...

  • Aqsa has completed her research work on "visualization on pingER data" and is now working on publishing a conference paper. Here are a few details of the research work.

    • The query results can be exported as a CSV file. She uses the CSV file of query results from Impala to draw line and bar charts with the Google Charts API.

    • She has created a data warehouse on PingER data. First, the PingER text files are transformed into CSV files. These CSV files are then uploaded to HDFS and used to populate the Impala tables, which are then queried.

    • The line and bar charts are created on a web page served from a localhost server. They can be updated as the query results vary.
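The step of populating Impala tables from CSV files already sitting in HDFS can be sketched as follows. This is only an illustration, not the code used in the project: the table name, column names, and HDFS directory are hypothetical, and the generated statement assumes plain comma-delimited text files. The helper just builds the DDL string; it would still need to be run through an Impala client.

```python
def impala_csv_table_ddl(table, columns, hdfs_dir):
    """Build an Impala CREATE EXTERNAL TABLE statement over comma-delimited text files."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_dir}';"
    )


# Hypothetical schema and HDFS path, for illustration only.
ddl = impala_csv_table_ddl(
    "pinger_rtt",
    [
        ("src_host", "STRING"),
        ("dst_host", "STRING"),
        ("day", "STRING"),
        ("avg_rtt_ms", "DOUBLE"),
    ],
    "/user/pinger/csv/rtt",
)
print(ddl)
```

Because the table is external, new CSV files copied into the same HDFS directory become visible to queries after the table metadata is refreshed, which fits a warehouse that is reloaded periodically.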
  • Aqsa has put together an abstract of a conference paper and submitted it to "The 3rd IEEE/ACM International Conference on Big Data Science, Engineering and Applications (BDSEA 2016)" (http://computing.derby.ac.uk/bdcat2016/). The paper is "Applying Big Data Warehousing and Visualization Techniques on pingER Data", by Aqsa Hameed, Dr. Saqib Ali, Dr. Les Cottrell and Bebo White, submitted to BDSEA 2016. Authors can access it via: https://easychair.org/conferences/?conf=bdsea2016

  • Aqsa and Saba are working together. Their goal is to visualize PingER historical data using a data warehouse. The idea is to develop a warehouse at UAF and make it publicly available. They are 50-60% done with setting up a Hadoop cluster with 3 nodes (1 master, 2 slaves). She is currently working on importing the PingER data into HDFS on the cluster. They have run some Impala queries on the data and are working on visualization.
    • Topic: visualization on pingER data (email from Aqsa and response from Renan)
      I have studied Google Charts as a visualization tool, but some points need to be discussed.
      1. The idea of applying visualization to the data warehouse (Impala query results) does not seem very useful, because the data warehouse contains static data, so the charts will also remain static and will need to be updated over time.

      Yes, it needs to be updated over time. My suggestion is to transform PingER data into data that can be inserted into the data warehouse. Some other Brazilian students and I have developed code to do this. Such a process should run at least once a day to keep the data warehouse up to date. This has never been done by any of us.

      2. The Google Charts API cannot integrate with Impala, because Impala is a distributed, Hadoop-based big-data database, whereas Google Charts can only integrate with flat files or flat databases such as MySQL.

      If the Google Charts API can only read flat files (e.g., CSV files), it is trivial to save a database query result as a flat CSV file to be consumed by Google Charts. Can Google Charts generate a plot dynamically after reading a just-created CSV file?
      Is using a different data visualization library (e.g., D3, https://d3js.org/) an option?
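The flat-file route discussed in this exchange can be sketched as below. It is an illustration under stated assumptions, not the project's code: the column names and sample values are made up, the first column is assumed to be a label (for example a date), and all remaining columns are assumed to be numeric. The function turns CSV text into the header-plus-rows array shape that `google.visualization.arrayToDataTable` accepts on the web page, so a page that re-fetches this JSON can redraw the chart whenever a fresh CSV file is exported.

```python
import csv
import io
import json


def csv_to_chart_rows(csv_text):
    """Convert CSV text into a list of rows: a header row, then label + numeric cells."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]  # skip blank lines
    if not rows:
        return []
    chart_rows = [rows[0]]  # header row is passed through unchanged
    for row in rows[1:]:
        converted = [row[0]]  # first column is kept as a text label
        for cell in row[1:]:
            try:
                converted.append(float(cell))
            except ValueError:
                converted.append(None)  # non-numeric cell becomes null in JSON
        chart_rows.append(converted)
    return chart_rows


# Made-up sample standing in for an exported Impala query result.
sample = "day,avg_rtt_ms\n2016-01-01,210.5\n2016-01-02,198\n"
chart_rows = csv_to_chart_rows(sample)
payload = json.dumps(chart_rows)  # JSON the web page could fetch and hand to the chart
print(payload)
```

The same JSON array could also be fed to another library such as D3, so this conversion step does not tie the pipeline to one charting tool.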
    • Aqsa and her team members are working on creating the data warehouse and are very close to completing it. Here are some updates.

      • Tehseen Qureshi has transformed the PingER text files into binaries and will soon be able to produce CSV files.

      • Saba is working on defining a 4-node cluster.
      • Aqsa has uploaded some sample CSV files to HDFS and run Impala queries. As soon as she receives the actual CSV files from Tehseen, these steps will be completed as well.
    • Visualization Status 

    • Aqsa has drawn a line chart and a bar chart from the data in a sample CSV file and is exploring more charts that can be drawn using the Google Charts API.

...