Lecture 1

This document provides an overview of geographic information systems (GIS), big data, the internet of things, and sample analytics. It discusses how GIS allows users to visualize and analyze spatial data to understand relationships. It also describes the volume, velocity, and variety characteristics of big data. The internet of things section outlines how devices collect and transmit data over the internet in areas like smart homes and transportation. The document concludes by noting that GIS tools can facilitate insights from big data by enabling spatial analysis of large datasets.

Uploaded by

2021868564

Topics covered

  • Real-time Data,
  • GIS,
  • Data Relationships,
  • Data Management,
  • Data Visualization,
  • Data Integration,
  • Data Mining,
  • Spatial Data,
  • NoSQL Databases,
  • Data Variety
Pusat Pengajian Sains Ukur dan Geomatik

Universiti Teknologi MARA


40450 Shah Alam
Selangor Darul Ehsan

Compiled by Maisarah Abdul Halim


Overview:
• GIS & GIS Applications
• Big Data
• Internet of Things
• Sample Analytics
Geographic Information Systems (GIS)
• A geographic information system (GIS) lets us visualize, question, analyze, and
interpret spatial data to understand relationships, patterns, and trends
• Spatial is special
• Almost everything that happens,
happens somewhere
GIS
APPLICATIONS
Big Data- What is Big Data?
• ‘Big-data’ is similar to ‘Small-data’, but bigger
• …but having bigger data consequently requires
different approaches:
◦ techniques, tools & architectures
• …to solve:
◦ New problems…
◦ …and old problems in a better way.
Characterization of Big Data – 3 V’s
Extremely large data sets that may be
analysed computationally to reveal
patterns, trends, and associations,
especially relating to human behaviour and
interactions

Source: http://www.igi-global.com/dictionary/big-data/39008
Volume (Scale)
• Data Volume
• 44x increase from 2009 to 2020
• From 0.8 zettabytes (ZB) to 35 ZB
• Data volume is increasing exponentially
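As a quick check on these figures: a 44x increase from 0.8 ZB gives about 35 ZB, and spread over the 11 years from 2009 to 2020 it works out to roughly 41% growth per year. The sketch below assumes steady compound growth, which is an illustrative assumption and not something the slide states.

```python
# Figures taken from the slide; the steady compound-growth model is an assumption.
start_zb = 0.8           # data volume in 2009, in zettabytes
factor = 44              # stated increase from 2009 to 2020
years = 2020 - 2009      # 11 years

end_zb = start_zb * factor               # projected volume in 2020
annual_growth = factor ** (1 / years)    # compound yearly multiplier

print(f"Volume in 2020: {end_zb:.1f} ZB")
print(f"Implied yearly growth: {(annual_growth - 1) * 100:.0f}%")
```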

Exponential increase in collected/generated data
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 2+ billion people on the Web by end 2011
• 76 million smart meters

Velocity (Speed)

• Data is being generated fast and needs to be processed fast
• Online Data Analytics
• Late decisions ➔ missing opportunities
• Examples
• E-Promotions: based on your current location, your purchase history, and what you like ➔
send promotions right now for the store next to you
• Healthcare monitoring: sensors monitoring your activities and body ➔ any abnormal
measurements require immediate reaction
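The e-promotion example depends on finding the store closest to a user's current position quickly. The sketch below uses the haversine great-circle distance; the store list, the `nearest_store` helper, and the 2 km cut-off are hypothetical, made up for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical store locations: (name, latitude, longitude).
STORES = [
    ("Store A", 3.0733, 101.5185),
    ("Store B", 3.1390, 101.6869),
    ("Store C", 2.9264, 101.6964),
]

def nearest_store(lat, lon, max_km=2.0):
    """Return (name, distance_km) of the closest store within max_km, else None."""
    name, s_lat, s_lon = min(STORES, key=lambda s: haversine_km(lat, lon, s[1], s[2]))
    dist = haversine_km(lat, lon, s_lat, s_lon)
    return (name, dist) if dist <= max_km else None

# A user standing a few hundred metres from Store A.
print(nearest_store(3.0700, 101.5200))
```

A linear scan over the stores is fine for a handful of locations; a production system with many stores would normally use a spatial index instead.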
Real-time/Fast Data

• Mobile devices (tracking all objects all the time)
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)

• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from
the collected data in a timely manner and in a scalable fashion
Variety (Complexity)
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
• Social Network, Semantic Web (RDF), …

• Streaming Data
• You can only scan the data once

• A single application can be generating/collecting many types of data

• Big Public Data (online, weather, finance, etc)

To extract knowledge ➔ all these types of data need to be linked together
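The streaming-data point above ("you can only scan the data once") means statistics have to be updated as each record arrives, without storing the stream. The sketch below is one way to do that, using Welford's single-pass method for the mean and variance. The `RunningStats` class and the sensor readings are hypothetical.

```python
class RunningStats:
    """Single-pass mean and sample variance (Welford's method)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance of everything seen so far (0.0 with fewer than 2 values)."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


def sensor_stream():
    """Hypothetical temperature readings arriving one at a time."""
    yield from [21.0, 21.5, 22.0, 35.0, 21.8]


stats = RunningStats()
for reading in sensor_stream():
    stats.update(reading)  # each reading is seen exactly once, then discarded

print(stats.n, round(stats.mean, 2), round(stats.variance, 2))
```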
Additional V’s :
• Veracity: much geospatial big data comes from unverified sources with low or
unknown accuracy, and the level of accuracy varies between data sources. This raises
questions about how to assess the quality of source data and how to “statistically”
improve the quality of analysis results.
• Visualization: provides valuable procedures for bringing human thinking into
big data analysis. Visualizations help analysts identify patterns (such as
outliers and clusters).
• Visibility: the emergence of cloud computing and cloud storage has made it
possible to efficiently access and process geospatial big data in ways that
were not previously possible.
Big Data- Big Analytics
• Complex math operations (machine learning, clustering,
trend detection, ….)
• Mostly specified as linear algebra on array data
• Integrated with complex analytics
• Specified as arrays, not tables
• The challenges include capture, curation, storage,
search, sharing, transfer, analysis, and visualization.
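Trend detection is one of the operations listed above that reduces to simple arithmetic on an array. The sketch below fits a least-squares line to an evenly spaced series and reads the trend from the slope. The `linear_trend` helper and the daily counts are hypothetical, and real analytics pipelines would typically use an array library instead of plain lists.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against positions 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2            # mean of 0..n-1
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily event counts for one location.
daily_counts = [120, 132, 129, 141, 150, 158]
slope, intercept = linear_trend(daily_counts)
direction = "rising" if slope > 0 else "falling or flat"
print(f"slope = {slope:.2f} per day ({direction})")
```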
Figures for one day of NYTimes’ Website
• 50 GB of uncompressed log files
• 10 GB of compressed log files
• 0.5 GB of processed log files
• 50-100M clicks
• 4-6M unique users
• 7,000 unique pages with more than 100 hits
• Index size 2 GB
• Pre-processing & indexing time
◦ ~10 min on workstation (4 cores & 32 GB)
◦ ~1 hour on EC2 (2 cores & 16 GB)
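To put these daily figures in perspective, they can be turned into ratios and average rates: about 5x compression, a 100x reduction from raw to processed logs, and roughly 580 to 1,160 clicks per second. The numbers in this sketch come from the list above; averaging evenly over the day is a simplifying assumption, since real traffic peaks and dips.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures from the slide (one day of traffic).
raw_gb, compressed_gb, processed_gb = 50, 10, 0.5
clicks_low, clicks_high = 50_000_000, 100_000_000

compression_ratio = raw_gb / compressed_gb    # raw logs vs. compressed logs
reduction_ratio = raw_gb / processed_gb       # raw logs vs. processed logs
rate_low = clicks_low / SECONDS_PER_DAY       # average clicks per second, low end
rate_high = clicks_high / SECONDS_PER_DAY     # average clicks per second, high end

print(f"Compression: {compression_ratio:.0f}x, reduction after processing: {reduction_ratio:.0f}x")
print(f"Average load: {rate_low:.0f} to {rate_high:.0f} clicks per second")
```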
Tools typically used in Big Data Scenario
• NoSQL Databases
◦ MongoDB, CouchDB, Cassandra, Redis, BigTable, HBase,
Hypertable, Voldemort, Riak, ZooKeeper
• MapReduce
◦ Hadoop, Hive, Pig, Cascading, Cascalog, mrjob, Caffeine, S4, MapR,
Acunu, Flume, Kafka, Azkaban, Oozie, Greenplum
• Storage
◦ S3, Hadoop Distributed File System
• Servers
◦ EC2, Google App Engine, Elastic Beanstalk, Heroku
• Processing
◦ R, Yahoo! Pipes, Mechanical Turk, Solr/Lucene, ElasticSearch,
Datameer, BigSheets, Tinkerpop
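MapReduce, listed above, is a programming model as well as a family of tools: a map step emits key-value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. The sketch below runs the three steps in a single process to show the idea. The function names are hypothetical and this is not the Hadoop API.

```python
from collections import defaultdict

def map_phase(record):
    """Map: emit (word, 1) for every word in one input record."""
    for word in record.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)

def map_reduce(records):
    """Run map, shuffle, and reduce over an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

# Hypothetical log lines.
logs = ["GET /home", "GET /maps", "POST /maps"]
print(map_reduce(logs))
```

In a real cluster the map and reduce calls run in parallel on different machines, and the shuffle moves data between them over the network.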
Internet of Things (IoT)
• Devices that collect and
transmit data over the
internet.
• Include technologies such as
smart homes, intelligent
transportation and smart
cities
• MIMOS Berhad (Malaysia's
national R&D centre in ICT
under the purview of the
Malaysian Ministry of
Science, Technology and
Innovation (MOSTI)) has
developed a Strategic
Roadmap for IoT in Malaysia
http://mimos.my/iot/roadmap2.html
Image Source: http://www.wordstream.com/blog/ws/2015/01/09/the-internet-of-things
Source: http://www.redtone.com/malaysia-unveils-iot-roadmap-expects-us11b-income-boost/
Source: http://2011trabzon.com/wp-content/uploads/2016/09/internet-of-things-examples-aiulgfp0.jpg
Your phone has been recording exactly where
you’ve been and how long you spent there.

https://www.buzzfeed.com/jimwaterson/your-iphone-knows-exactly-where-youve-been-and-this-is-how-t?utm_term=.fmlGLZyWd#.np6dm85VQ
http://www.pcworld.com/article/2907061/4-ways-your-android-device-is-tracking-you-and-how-to-stop-it.html
Ok Google / Siri

Google Analytics tracks and reports website traffic.

• “Google Suggest” or “Autocomplete”


• Based on popularity
Source: http://paultan.org/
Data-Driven Geography
• More than 80% of the data kept by organizations worldwide
has a location component.
• By combining location-related data with other data,
organizations can gain critical insights, make better decisions,
and optimize important processes and applications.
• A data-driven geography may be emerging in response to the
wealth of georeferenced data flowing from sensors and
people in the environment.
Sources of geographically (and often
temporally) referenced data:
• Location-aware technologies such as the Global Positioning System
and mobile phones
• In-situ sensors carried by individuals in phones, attached to vehicles, and
embedded in infrastructure
• Remote sensors carried by airborne and satellite platforms
• Radio-frequency identification (RFID) tags attached to objects
• Georeferenced social media
GIS and BIG DATA???
• Traditional GIS systems are often insufficient
for meaningful interpretation
• Collection of Geospatial Big Data
• Geographically-enabled social media
- Facebook, Twitter, Instagram
• As geospatial experts, we are very interested in the location
component of data
• Able to answer and forecast questions such as WHERE? and WHEN?
– Spatio-temporal analytics
WHY?
• Taps huge datasets for policy measures – GIS tools for Big Data
processing facilitate deep insights and predictive modelling for policy
making in health care, crime detection, disaster response and more.

• Supports spatial analysis of unstructured data in real-time – Maps
integrate unstructured data (e-mails, blogs, social media content, in-store
sensor data, meteorological data, driving times, etc.) in real-time.
This is useful for location analysis in retail, finance and insurance.

• Integrates multiple data layers for a complete picture – Huge amounts
of data are pulled from different formats, devices or systems and given a
geographic context for a complete picture or analysis.

• Empowers the Business Intelligence (BI) approach to businesses – The
convergence of Big Data and mapping leads to deeper insights,
profitability, shorter time-to-market, improved customer engagement
and better ROI.

• Enables spatiotemporal queries over big geospatial data – Case-by-case
query processing and data mining of huge spatiotemporal data is
possible for different projects.
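A spatiotemporal query, as mentioned in the last point, asks WHERE and WHEN at the same time. The sketch below filters observations by a bounding box and a time window. The `Observation` type, the `query` helper, and the sample records are hypothetical, and a real system would use a spatial index instead of scanning every record.

```python
from datetime import datetime
from typing import NamedTuple

class Observation(NamedTuple):
    lat: float
    lon: float
    time: datetime
    label: str

def query(observations, bbox, start, end):
    """Return observations inside bbox and within the half-open window [start, end).

    bbox is (min_lat, min_lon, max_lat, max_lon) in decimal degrees.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    return [
        o for o in observations
        if min_lat <= o.lat <= max_lat
        and min_lon <= o.lon <= max_lon
        and start <= o.time < end
    ]

# Hypothetical georeferenced records.
records = [
    Observation(3.07, 101.52, datetime(2017, 3, 1, 8, 30), "inside, morning"),
    Observation(3.07, 101.52, datetime(2017, 3, 1, 20, 0), "inside, evening"),
    Observation(5.41, 100.33, datetime(2017, 3, 1, 8, 45), "outside, morning"),
]

# WHERE: a box around Shah Alam. WHEN: the morning of 1 March 2017.
hits = query(records, (3.0, 101.4, 3.2, 101.6),
             datetime(2017, 3, 1, 6, 0), datetime(2017, 3, 1, 12, 0))
print([o.label for o in hits])
```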
SAMPLE ANALYTICS
Areas where geospatial technology has applied Big Data for enhanced analysis:
• Climate modelling and analysis
• Location analytics
• Retail and E-commerce
• Intelligence gathering
• Terrorist financing
• Aviation industry
• Disease surveillance
• Disaster response and early warnings
• Political campaigns and elections
• Banking
• Insurance and Fraud analysis
Greetings from London
• Mapping of how you’d
say ‘hello’ in the most
frequently spoken
language aside from
English
• Each ‘hello’ has been
scaled to show the
percentage of people
in each area who use it
Source: http://mappinglondon.co.uk/2016/greetings-from-london/
• Vessels tracking and identification of illegal vessels/ immigrants
• Spatio-temporal analysis for congestion detection --making use of
voluminous traffic data to help alleviate congestion
3D Visualisation of traffic data
• California-based data artist
Eric Fischer used data from
Twitter and Gnip to create
these heatmaps of major
cities around the world,
illustrating the great divide
between tourists and locals.
• Red represents tweets from
tourists while blue
symbolizes local tweets.
(Figure: Washington DC)
Source: http://www.rsvlts.com/2015/03/04/twitter-heatmaps/
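A heatmap like this starts by counting points per map cell for each group. The sketch below bins georeferenced points into a regular latitude/longitude grid. It is not Eric Fischer's method: the 0.01-degree cell size and the sample points are hypothetical, and the "tourist"/"local" label is assumed to be supplied with each point.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the integer (row, column) index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def heatmap_counts(points, cell_deg=0.01):
    """Count points per (cell, group); points are (lat, lon, group) tuples."""
    counts = Counter()
    for lat, lon, group in points:
        counts[(grid_cell(lat, lon, cell_deg), group)] += 1
    return counts

# Hypothetical tweets around Washington DC, already labelled by group.
tweets = [
    (38.8895, -77.0353, "tourist"),
    (38.8897, -77.0351, "tourist"),
    (38.9072, -77.0369, "local"),
]

for (cell, group), n in sorted(heatmap_counts(tweets).items()):
    print(cell, group, n)
```

Each cell's count per group would then drive the red or blue intensity when the grid is drawn.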
Space-time scan statistics for crime pattern detection
Travel mode detection using GPS tracks of two months period
Conclusion
• 3 V’s of Big Data: Velocity, Volume,
and Variety
• Mapping and analysis become
further complicated with the
explosion of disruptive
technologies like the cloud,
embedded sensors, mobile and
social media.
• IoT is a convergence of smart
devices that generate data
through the internet.
• Data has a spatial component
• How can you make use of Big Data
in your organization?
References/ Further Reading
• GIS in the Era of Big Data [https://cybergeo.revues.org/27647]
• Looking Forward Again: Four Thoughts on the Future of GIS in 2015 and Beyond
[http://www.esri.com/esri-news/arcwatch/0215/four-thoughts-on-the-future-of-gis-in-2015-and-beyond]
• Empowering GIS big data [https://www.gislounge.com/empowering-gis-big-data/]
• Miller, H. J., & Goodchild, M. F. (2015). Data-driven
geography. GeoJournal, 80(4), 449-461.
• IoT Strategic Roadmap
[http://www.mimos.my/iot/National_IoT_Strategic_Roadmap_Summary.pdf]
• Big Data Tutorial
[http://www.planet-data.eu/sites/default/files/presentations/Big_Data_Tutorial_part4.pdf]
