Running Head: Data Warehousing, Big Data, and Green Computing 1
Student Name
Institutional Affiliation
Date
Introduction
The world is moving fast towards a situation where data and associated information
reign supreme over every other resource known to man. The demand for relevant data
that is stored, optimized, and queried in a timely fashion far outweighs the demand for any
other type of technological resource. The massive datasets generated by the numerous
companies and organizations across the globe require a method of storage and efficient
querying in the world of big data. This is where data warehouses and data marts come into
play. As these huge quantities of data need enormous computing power to crunch through, it
is important to have environmentally friendly processing centers, which form the core of
green computing.
There are specially designed relational databases that are uniquely optimized for
querying and analysis. These relational databases contain massive stores of historical data obtained from
transaction data. Referred to as “data warehouses”, these databases can also include data
obtained from various foreign sources (Ariyachandra & Watson, 2010). The main objective in
developing data warehouses is to separate workloads for analysis from workloads designed for
transactions. This configuration helps an organization to assimilate data from multiple foreign
(ETL), combined with an efficient Online Analytical Processing (OLAP) engine, feature in
In both housing and software architectures, there is a proper and efficient arrangement
of elements to create a finished system in which all components work harmoniously. Data
warehouses are no exception, and they consist of both hardware and software components.
Different organizations create their data warehouses using various configurations, each
uniquely suited to the needs of the particular organization. The modular design allows for
future enhancement of each module with extra services and tooling. The final design often
depends on the circumstances in which the organization needs to build up a data warehouse.
Verified components of a functional data warehouse include source data component, data
metadata.
Source data components include data coming into the functional data
customer profiles, and reports), archived data (from old data), and external
Data staging components are used to prepare the data after it is collected from
various sources and operational systems. The data at this stage is changed,
that can be stored in a database for query and analysis. Extraction involves
cleaning errors, missing data, and duplicates in the data. Transformation also
involves standardizing data, and lastly loading the data into a data warehouse
Data storage components are built for operational systems that contain current
data. This data is usually efficiently normalized to make its processing faster.
Format revisions – standardizing the data type and length of the fields.
Calculated/ decoded values – for example, finding the total cost and profit
Splitting of single fields e.g. first name, last name, and middle name.
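The cleaning step and the three transformation types listed above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name stage_rows, the field names (order_id, full_name, order_date, country, unit_cost, unit_price, quantity), the day/month/year input date format, and the sample rows are all hypothetical and do not come from any particular warehouse product.

```python
from datetime import datetime


def stage_rows(raw_rows):
    """Clean raw source rows, then apply three staging transformations."""
    seen_ids = set()
    staged = []
    for row in raw_rows:
        # Cleaning: drop rows with missing key fields and duplicate orders.
        name = (row.get("full_name") or "").strip()
        if row.get("order_id") is None or not name:
            continue
        if row["order_id"] in seen_ids:
            continue
        seen_ids.add(row["order_id"])

        # Format revision: one date type and one fixed-length country code.
        order_date = datetime.strptime(row["order_date"], "%d/%m/%Y").date()
        country = row["country"].strip().upper()[:2]

        # Calculated values: total cost and profit for the order line.
        total_cost = row["unit_cost"] * row["quantity"]
        profit = row["unit_price"] * row["quantity"] - total_cost

        # Splitting a single field into first, middle, and last name.
        parts = name.split()
        first_name = parts[0]
        last_name = parts[-1] if len(parts) > 1 else ""
        middle_name = " ".join(parts[1:-1])

        staged.append({
            "order_id": row["order_id"],
            "order_date": order_date,
            "country": country,
            "total_cost": total_cost,
            "profit": profit,
            "first_name": first_name,
            "middle_name": middle_name,
            "last_name": last_name,
        })
    return staged


# Hypothetical sample data: one valid row, one duplicate, one with a missing name.
SAMPLE_ROWS = [
    {"order_id": 1, "full_name": "Ada Mary Lovelace", "order_date": "02/05/2007",
     "country": " us ", "unit_cost": 4.0, "unit_price": 6.5, "quantity": 3},
    {"order_id": 1, "full_name": "Ada Mary Lovelace", "order_date": "02/05/2007",
     "country": " us ", "unit_cost": 4.0, "unit_price": 6.5, "quantity": 3},
    {"order_id": 2, "full_name": "", "order_date": "03/05/2007",
     "country": "gb", "unit_cost": 1.0, "unit_price": 2.0, "quantity": 1},
]

if __name__ == "__main__":
    for staged_row in stage_rows(SAMPLE_ROWS):
        print(staged_row)
```

In a real deployment the staged rows would then be loaded into the warehouse tables. The sketch stops before that step because the loading mechanism depends on the chosen platform.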
as Google’s BigQuery and Amazon’s Redshift. This cloud-based deployment is efficient and
perfect for eliminating complexity and cost constraints associated with data warehouses. A
majority of cloud operations are automated throughout the data warehouse lifecycle. Machine
learning and artificial intelligence are helping to improve accessibility and data-driven
processing.
Big Data
In modern enterprise computing, big data as a concept can be broken down into three
sub-concepts, each presenting a unique challenge in the area of data management. These
sub-concepts are the variety, velocity, and increasing volume of data. None of these problems
(the three V’s of big data) can be solved using traditionally implemented databases
(Zikopoulos & Eaton, 2011). The volume of big data can range from a few terabytes to
many petabytes. The variety of big data consists of data from multiple sources
encoded in various formats. These include online transactions, social media interactions, web
logs, and financial transactions. The velocity of big data refers to the speed at which
businesses generate data and deliver actionable insights to end users, meaning that the data is
commerce company utilize the concepts of big data to perform analytics operations on data
obtained from sales and customer interactions. The business efficiently profiled customers
and enabled the sales teams to engage each customer in real time. This direct, conversational
approach appealed to customers and generated a lot of repeat business. The e-commerce
company was also able to efficiently collect and process user feedback. This process helped
the executives understand how their customers view the company’s services and products.
Through feedback analysis, the company was able to redevelop its products and conduct fast
risk analysis.
to keep up with the massive amount of data generated and analyzed by the business on a daily
basis. Data management is moving to the cloud as more companies look toward infrastructure
that is scalable and costs less than regular on-premises deployments. Organizations are also
forced to hire qualified personnel to carry out complex analytical operations using various
business intelligence and analytics tools. In addition to this, there is a growing need for
organizations to have IT staff who are competent in cloud-based technologies and big data
Green Computing
The term green computing is used in reference to the efficient use of computing assets
computing include recyclability of dead IT components and factory waste, minimal use of
in data centers. Renewable energy sources, which are freely available and produce almost zero
pollution, form the basis of the entire green computing movement. The best examples of green
computing data centers run on geothermal power, wind turbine energy, and solar power. The
stated goals of the drive for green computing include design of the right algorithms to
enhance processor efficiency, decreasing the use of environmentally hazardous materials, and
promoting the use of recyclable components to build computers (Rong et al., 2016).
Organizations with a need to build “green” data centers can take suitable steps
1. Using ecofriendly materials that are easy to recycle and leave a minimal carbon
footprint. The company can also use alternative power sources for server shutdowns,
2. Enhancing data center cooling by locating company data centers in cold environments
4. Minimizing power usage at the data center, which is a significant driver of variable
costs in the data center. It is estimated that for each functional data center, power
costs take up about 10% of recurring costs. This requires a drastic reduction in the
servers. This situation can be managed by replacing old servers, getting rid of
5. Conducting baseline energy audits, which provide the organization with a real-time outline
of how energy is utilized at the data centers. These outlines form the benchmark for
future audits and long-term planning as the company aims to reduce energy
consumption.
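The way a baseline audit serves as a benchmark for later audits can be shown with a short Python sketch. It is a simplified illustration: the function name compare_to_baseline, the energy categories, and all kilowatt-hour figures are hypothetical and are not measurements from any real data center.

```python
def compare_to_baseline(baseline_kwh, current_kwh):
    """Return the percentage change in energy use per category versus the baseline audit."""
    changes = {}
    for category, base in baseline_kwh.items():
        if base <= 0:
            # A percentage change is not meaningful without a positive baseline.
            continue
        current = current_kwh.get(category, 0.0)
        changes[category] = (current - base) * 100.0 / base
    return changes


# Hypothetical audit figures in kilowatt-hours for one billing period.
BASELINE_AUDIT = {"servers": 1000.0, "cooling": 500.0, "lighting": 50.0}
FOLLOW_UP_AUDIT = {"servers": 800.0, "cooling": 450.0, "lighting": 50.0}

if __name__ == "__main__":
    for category, pct in compare_to_baseline(BASELINE_AUDIT, FOLLOW_UP_AUDIT).items():
        print(f"{category}: {pct:+.1f}% versus baseline")
```

A negative percentage indicates that consumption in that category has fallen since the baseline audit, which is the outcome that steps such as replacing old servers are meant to achieve.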
An organization that continues to apply the concepts of green computing with remarkable
success is Google. The company specializes in purchasing land in remote areas where land
prices are low, thus reducing the overall cost of running the data centers (Wheeland,
2007). The energy sources that Google uses are cheap and highly available, such as wind,
solar, and geothermal energy. The company has also determined how
much energy is needed to power large server farms, and relies only on environmentally
friendly solutions. Google also works to actively reduce the heat output of each of its servers
Conclusion
The concepts of green computing, big data, and data warehousing are ubiquitous in
the world of modern enterprise computing. Green computing requires that all organizations
with large-scale IT infrastructure utilize ecofriendly techniques and materials in establishing
any data centers. Big data is characterized by the variety of data (from many
sources), increased volume (more and more data is produced on a daily basis), and increased
speed of data production by companies and individuals across the globe. Warehousing of data
involves designing, building, and maintaining relational databases uniquely optimized for the
storage, querying, and analysis of historical and current data in a variety of contexts in order
References
Ariyachandra, T., & Watson, H. (2010). Key organizational factors in data warehouse
Beloglazov, A., Abawajy, J., & Buyya, R. (2012). Energy-aware resource allocation
heuristics for efficient management of data centers for cloud computing. Future
Rong, H., Zhang, H., Xiao, S., Li, C., & Hu, C. (2016). Optimizing energy consumption for
Wheeland, M. (2007, May 2). Green Computing at Google. Retrieved from Greenbiz.com:
https://www.greenbiz.com/news/2007/05/02/green-computing-google
Zikopoulos, P., & Eaton, C. (2011). Understanding big data: Analytics for enterprise class