<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Exploratory Data Analysis on Welcome</title><link>https://yujia21.github.io/tags/exploratory-data-analysis/</link><description>Recent content in Exploratory Data Analysis on Welcome</description><generator>Hugo</generator><language>en-us</language><copyright>&lt;a href="https://creativecommons.org/licenses/by-nc/4.0/" target="_blank" rel="noopener"&gt;CC BY-NC 4.0&lt;/a&gt;</copyright><lastBuildDate>Sat, 13 Nov 2021 00:00:00 +0000</lastBuildDate><atom:link href="https://yujia21.github.io/tags/exploratory-data-analysis/index.xml" rel="self" type="application/rss+xml"/><item><title>Recent resale flat sales</title><link>https://yujia21.github.io/portfolio/2021-11-13-resale/</link><pubDate>Sat, 13 Nov 2021 00:00:00 +0000</pubDate><guid>https://yujia21.github.io/portfolio/2021-11-13-resale/</guid><description>&lt;h1 id="dataset"&gt;Dataset&lt;/h1&gt;
&lt;p&gt;Dataset retrieved as usual from &lt;a href="https://data.gov.sg/dataset/resale-flat-prices"&gt;Singapore&amp;rsquo;s public data&lt;/a&gt;.
Only data after 2017 for the towns of Queenstown, Bukit Merah, and Clementi is displayed in the graph below.&lt;/p&gt;
&lt;h1 id="interactive-graph"&gt;Interactive graph&lt;/h1&gt;
&lt;p&gt;(This has been removed temporarily as it is too large)&lt;/p&gt;
&lt;h1 id="process"&gt;Process&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;Data cleanup&lt;/li&gt;
&lt;li&gt;Learning basic JavaScript to write callbacks that filter the dataset in the Bokeh graph (inspiration &lt;a href="https://stackoverflow.com/questions/69801567/how-to-filter-a-bokeh-visual-based-on-geojson-data"&gt;here&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;Not the most optimized: the graph will be slow to reload because the filter has to loop through the dataset&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>What data does TraceTogether really track?</title><link>https://yujia21.github.io/portfolio/2020-04-10-trace-together/</link><pubDate>Fri, 10 Apr 2020 00:00:00 +0000</pubDate><guid>https://yujia21.github.io/portfolio/2020-04-10-trace-together/</guid><description>&lt;h1 id="tracetogether"&gt;TraceTogether&lt;/h1&gt;
&lt;p&gt;In the midst of the COVID-19 crisis, the Singapore government has developed an app called &lt;a href="https://www.tracetogether.gov.sg/"&gt;TraceTogether&lt;/a&gt;, on which &lt;a href="https://splira.com/2020-03-28/"&gt;much&lt;/a&gt; &lt;a href="https://digitalreach.asia/tracetogether-disassembling-it-wasnt-easy-to-confirm-the-governments-claims-on-privacy/"&gt;ink&lt;/a&gt; &lt;a href="https://www.reddit.com/r/singapore/comments/fno0me/tracetogether_doesnt_seem_to_collect_much_info/"&gt;has&lt;/a&gt; &lt;a href="https://github.com/opentrace-community"&gt;already&lt;/a&gt; &lt;a href="https://www.tech.gov.sg/media/technews/tracetogether-behind-the-scenes-look-at-its-development-process"&gt;been&lt;/a&gt; &lt;a href="https://www.reddit.com/r/singapore/comments/ftpc3r/oc_i_extracted_my_own_tracetogether_data_from_my/"&gt;spilt&lt;/a&gt;. The goal of the app is to aid in contact tracing by keeping track of people you might unknowingly come into close contact with (for example, on the train or at a hawker center) by tracking Bluetooth signals. I won&amp;rsquo;t go into more detail on how the app works or on privacy concerns, since those are not really my areas of &amp;ldquo;expertise&amp;rdquo;; I will merely content myself with examining the data collected in my TraceTogether app.&lt;/p&gt;</description></item><item><title>Which boat should my rowing club buy?</title><link>https://yujia21.github.io/portfolio/2020-03-15-rowing-club-stats/</link><pubDate>Sun, 15 Mar 2020 00:00:00 +0000</pubDate><guid>https://yujia21.github.io/portfolio/2020-03-15-rowing-club-stats/</guid><description>&lt;h1 id="dataset"&gt;Dataset&lt;/h1&gt;
&lt;p&gt;As one of the resident data scientists of my &lt;a href="http://easterrowing.club"&gt;rowing club&lt;/a&gt;, I decided to play a little with the data generated by boat bookings over the year 2019. Data was collected from a shared online Google Sheets file, where members wrote down their names for the boat they intended to use and the date/time when they were rowing. Since this analysis was done, stricter logging policies have been implemented (standardizing the names used by members by forcing the choice to be made from a drop-down list, and enforcing a full list of crew names for crew boats), so next year&amp;rsquo;s data analysis should be less painful! Of course, this dataset doesn&amp;rsquo;t take into account those with private single boats: they appear on the sheet only when they row in crew boats.&lt;/p&gt;</description></item><item><title>How much does it rain in Singapore?</title><link>https://yujia21.github.io/portfolio/2019-09-15-rainfall-sg/</link><pubDate>Sun, 15 Sep 2019 00:00:00 +0000</pubDate><guid>https://yujia21.github.io/portfolio/2019-09-15-rainfall-sg/</guid><description>&lt;h1 id="dataset"&gt;Dataset&lt;/h1&gt;
&lt;p&gt;Real-time weather data can be retrieved from this &lt;a href="https://data.gov.sg/dataset/realtime-weather-readings"&gt;API provided by NEA and the Singapore government&lt;/a&gt; and is available for free. It covers air temperature, rainfall, wind speed, and wind direction.&lt;/p&gt;
&lt;p&gt;Data can only be fetched day by day or for a specific timestep through this API. It is pretty simple to write a quick script to request data from the API using Python&amp;rsquo;s &amp;ldquo;requests&amp;rdquo; library. Data is returned in JSON format, and converting it into a pandas dataframe takes just a little more work (a naive conversion into a dataframe results in two columns: one for the timestamp and one where the entry for each timestep is a list of dictionaries of the form {&amp;ldquo;sensor id&amp;rdquo;: xx, &amp;ldquo;value&amp;rdquo;: yy}).&lt;/p&gt;</description></item><item><title>What do Singapore's electricity prices look like?</title><link>https://yujia21.github.io/portfolio/2019-08-31-usep-sg/</link><pubDate>Sat, 31 Aug 2019 00:00:00 +0000</pubDate><guid>https://yujia21.github.io/portfolio/2019-08-31-usep-sg/</guid><description>&lt;h1 id="dataset"&gt;Dataset&lt;/h1&gt;
&lt;p&gt;The following data is retrieved from the &lt;a href="https://www.emcsg.com/marketdata/priceinformation"&gt;Energy Market Company website&lt;/a&gt; and is available for free.&lt;/p&gt;
&lt;p&gt;This dataset comprises the 2018 Uniform Singapore Energy Price (USEP) as well as the overall demand in Singapore in MW, among other variables that are not shown below. Data is available in half-hourly periods. The USEP is given here in $/MWh, and is determined by the demand and supply in the electricity market during the period in question.&lt;/p&gt;</description></item></channel></rss>