
Running Head: Data Warehousing, Big Data, and Green Computing 1

This document discusses three topics: data warehousing, big data, and green computing. It describes data warehousing as specially designed databases for data analysis and querying that contain large stores of historical data. It outlines the typical components of a data warehouse architecture. It then discusses big data in terms of the three V's - volume, variety, and velocity. Finally, it defines green computing as the efficient and environmentally sustainable use of computing resources through methods like renewable energy and reducing hazardous materials.

Uploaded by

Alex Mzirai
Copyright
© All Rights Reserved

Running Head: DATA WAREHOUSING, BIG DATA, AND GREEN COMPUTING 1

Data Warehousing, Big Data, and Green Computing

Student Name

Institutional Affiliation

Date

Introduction

The world is moving quickly toward a situation in which data, and the information derived from it, outranks every other resource. Demand for relevant data that is stored, optimized, and queried in a timely fashion now outweighs demand for almost any other technological resource. The massive datasets that companies and organizations across the globe generate need efficient storage and querying, and this is where data warehouses and data marts come into play. Because crunching through such quantities of data takes enormous computing power, environmentally friendly processing centers, which form the core of green computing, are equally important.

Data Warehouse Architecture

Data warehouses are specially designed relational databases that are optimized for querying and analysis of data rather than for transaction processing. They contain massive stores of historical data derived from transaction data, and they can also include data obtained from various external sources (Ariyachandra & Watson, 2010). The main objective in developing a data warehouse is to separate the analysis workload from the transaction workload, which also helps an organization consolidate data from multiple external sources. Modern data warehouse designs therefore consistently combine extraction, transformation, transportation, and loading (ETL) processes with an efficient Online Analytical Processing (OLAP) engine.
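The separation of analytical queries from transactional work can be illustrated with a small sketch. The example below is only an assumption-laden illustration, not a design taken from the cited sources: it uses Python's built-in sqlite3 module as a stand-in for a real warehouse engine, and the table names (fact_sales, dim_product) and the sample figures are hypothetical. It shows the kind of roll-up over historical data that an OLAP engine answers.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create a tiny in-memory star schema: one fact table and one dimension.

    Table names and sample rows are hypothetical, chosen only for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            category   TEXT NOT NULL
        );
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            sale_year  INTEGER NOT NULL,
            amount     REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "books"), (2, "games")],
    )
    conn.executemany(
        "INSERT INTO fact_sales (product_id, sale_year, amount) VALUES (?, ?, ?)",
        [(1, 2019, 10.0), (1, 2020, 15.0), (2, 2019, 40.0),
         (2, 2020, 20.0), (1, 2020, 5.0)],
    )
    conn.commit()
    return conn


def sales_by_category_and_year(conn: sqlite3.Connection) -> list:
    """Analytical roll-up: total historical sales per category and year."""
    return conn.execute(
        """
        SELECT p.category, f.sale_year, SUM(f.amount) AS total
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category, f.sale_year
        ORDER BY p.category, f.sale_year
        """
    ).fetchall()


if __name__ == "__main__":
    for row in sales_by_category_and_year(build_warehouse()):
        print(row)
```

A query of this kind scans and aggregates many historical rows at once, which is why it is kept away from the systems that record individual transactions.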

As in the architecture of buildings, software architecture arranges elements properly and efficiently so that all components of the finished system work harmoniously. Data warehouses are no exception, and they consist of both hardware and software components. Different organizations build their data warehouses in different configurations, each suited to the needs of the particular organization. A modular design allows each module to be enhanced in the future with extra services and tooling. The final design often depends on the circumstances in which the organization needs to build the data warehouse. The typical components of a functional data warehouse are the source data component, the data staging component, the data storage component, the information delivery component, and metadata.

• Source data components cover the data coming into the data warehouse. They can be arranged into four broad categories: production data (from operational systems), internal data (from spreadsheets, customer profiles, and reports), archived data (older data retired from operational systems), and external data (such as industry statistics).

• Data staging components prepare the data after it is collected from the various sources and operational systems. At this stage the data is changed, converted, and transformed into a format suitable for storage in a database for query and analysis. Extraction obtains the data from the various data sources. Transformation cleans errors, missing values, and duplicates and standardizes the data. Loading then writes the prepared data into the data warehouse before it goes live.

• Data storage components form a repository that is separate from the operational systems. Operational systems contain current data, usually highly normalized so that transactions are processed quickly, whereas the warehouse repository keeps large volumes of historical data structured for analysis.

• The information delivery component lets users subscribe to warehouse data and delivers it to multiple destinations according to a schedule, e.g. via intranets, email, and the internet.

• Metadata components consist of the data dictionaries and data catalogs in the DBMS. The dictionaries store data about logical data structures, addresses, indices, and records.
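The staging and metadata components described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than a production pipeline: the sample CSV data, the customers table, and the function names are all hypothetical, and sqlite3 again stands in for the warehouse. The sketch extracts rows, drops records with missing names and duplicate keys, standardizes a field, loads the result, and then reads the column definitions back from the database's own catalog.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract: one duplicate row and one row with a missing name.
RAW_CSV = """customer_id,name,country
1,Ada Lovelace,uk
2,Grace Hopper,US
2,Grace Hopper,US
3,,us
"""


def extract(raw: str) -> list:
    """Extraction: read the source data into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows: list) -> list:
    """Transformation: drop missing and duplicate records, standardize country."""
    seen = set()
    clean = []
    for row in rows:
        name = row["name"].strip()
        if not name:                       # missing data
            continue
        if row["customer_id"] in seen:     # duplicate key
            continue
        seen.add(row["customer_id"])
        clean.append((int(row["customer_id"]), name,
                      row["country"].strip().upper()))
    return clean


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Loading: write the prepared rows into the (stand-in) warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "country TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


def describe(conn: sqlite3.Connection, table: str) -> list:
    """Metadata: (column name, declared type) pairs from SQLite's catalog.

    The table name is interpolated directly, so it must come from trusted code.
    """
    return [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), connection)
    print(connection.execute("SELECT * FROM customers").fetchall())
    print(describe(connection, "customers"))
```

Keeping extraction, transformation, and loading in separate functions mirrors the modular design discussed earlier, since each step can be replaced without touching the others.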

Data transformations include:

• Format revisions – standardizing the data type and length of the fields.

• Calculated and derived values – for example, finding the total cost and profit margins (calculated values) and customer age (derived value).

• Splitting of single fields – e.g. separating a full name into first name, middle name, and last name.

• Merging of information – bringing related fields together into one record, e.g. the price, package type, and description of a product.
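Each of the four kinds of transformation listed above can be shown as a small function. The sketch below is illustrative only; the function names, field names, and sample values are assumptions of mine and do not come from the cited sources.

```python
from datetime import date


def revise_format(raw_amount: str) -> float:
    """Format revision: turn a string such as ' 1,250.50 ' into a number."""
    return round(float(raw_amount.replace(",", "").strip()), 2)


def total_cost(unit_price: float, quantity: int) -> float:
    """Calculated value: total cost of an order line."""
    return round(unit_price * quantity, 2)


def profit_margin(revenue: float, cost: float) -> float:
    """Calculated value: profit as a fraction of revenue."""
    return round((revenue - cost) / revenue, 4)


def customer_age(birth_date: date, today: date) -> int:
    """Derived value: age in whole years on the given day."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not happened yet this year
    return years


def split_name(full_name: str) -> dict:
    """Splitting of a single field into first, middle, and last name."""
    parts = full_name.split()
    if not parts:
        return {"first_name": "", "middle_name": "", "last_name": ""}
    return {
        "first_name": parts[0],
        "middle_name": " ".join(parts[1:-1]),
        "last_name": parts[-1] if len(parts) > 1 else "",
    }


def merge_product_info(prices: dict, packages: dict, descriptions: dict) -> dict:
    """Merging: combine fields held separately, keyed by product id."""
    shared_ids = prices.keys() & packages.keys() & descriptions.keys()
    return {
        pid: {
            "price": prices[pid],
            "package_type": packages[pid],
            "description": descriptions[pid],
        }
        for pid in sorted(shared_ids)
    }


if __name__ == "__main__":
    print(revise_format(" 1,250.50 "))
    print(customer_age(date(1990, 6, 15), date(2020, 6, 14)))
    print(split_name("Mary Ann Smith"))
    print(merge_product_info({101: 9.99}, {101: "box"}, {101: "Widget"}))
```

Splitting names on whitespace is a deliberate simplification; real customer data would need rules for multi-word surnames and titles.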

Data warehousing is now migrating to managed, and in some cases serverless, cloud platforms such as Google’s BigQuery and Amazon’s Redshift. Cloud-based deployment removes much of the complexity and cost associated with traditional data warehouses, and the majority of operations across the data warehouse lifecycle are automated. Machine learning and artificial intelligence are also helping to improve accessibility and data-driven processing.

Big Data

In modern enterprise computing, big data as a concept can be broken down into three sub-concepts, each presenting a unique challenge in the area of data management: increasing volume, variety, and velocity. None of these problems (the three V’s of big data) can be solved using traditionally implemented databases (Zikopoulos & Eaton, 2011). The volume of big data can range from a few terabytes to many petabytes. Variety refers to data that comes from multiple sources and is encoded in various formats, including online transactions, social media interactions, web logs, and financial transactions. Velocity refers to the speed at which businesses generate data and must deliver actionable insights to end users, meaning that the data is collected, processed, and analyzed within a short time.
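Variety and velocity can be made concrete with a short sketch. The example below is an illustration under my own assumptions, not a description of any particular big data platform: the two input formats, the field names ts and source, and the 60-second window are hypothetical. It accepts records in two different encodings and counts events per source in fixed time windows in a single pass, without storing the raw events.

```python
import json
from collections import Counter


def parse_event(line: str) -> tuple:
    """Variety: accept a JSON object or a 'timestamp,source' CSV line.

    Returns (timestamp in seconds, source name).
    """
    line = line.strip()
    if line.startswith("{"):
        record = json.loads(line)
        return int(record["ts"]), str(record["source"])
    ts, source = line.split(",", 1)
    return int(ts), source.strip()


def tumbling_window_counts(lines, window_seconds: int = 60) -> dict:
    """Velocity: count events per source in fixed, non-overlapping windows.

    Each event is processed once as it arrives; only the counts are kept.
    """
    counts = {}
    for line in lines:
        ts, source = parse_event(line)
        window_start = ts - ts % window_seconds
        counts.setdefault(window_start, Counter())[source] += 1
    return counts


if __name__ == "__main__":
    stream = [
        '{"ts": 5, "source": "web_log"}',
        "30,social_media",
        '{"ts": 61, "source": "web_log"}',
        "119, web_log",
    ]
    for start, per_source in sorted(tumbling_window_counts(stream).items()):
        print(start, dict(per_source))
```

Volume is the one dimension a sketch of this size cannot show; at terabyte or petabyte scale the same logic would have to be distributed across many machines.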

In my experience in enterprise computing, I have observed a highly successful e-commerce company use the concepts of big data to perform analytics on data obtained from sales and customer interactions. The business profiled its customers efficiently and enabled its sales teams to engage each customer in real time. This direct, conversational approach appealed to customers and generated a lot of repeat business. The company was also able to collect and process user feedback efficiently, which helped the executives understand how customers viewed the company’s services and products. Through feedback analysis, the company was able to redevelop its products and conduct fast risk analysis.

Big data is also forcing businesses and their IT environments to evolve. Organizations have to automate processes in order to keep up with the massive amounts of data that the business generates and analyzes every day. Data management is moving to the cloud as more companies look for infrastructure that is scalable and costs less than regular on-premises deployments. Organizations also need to hire qualified personnel to carry out complex analytical operations using various business intelligence and analytics tools, and there is a growing need for IT staff who are competent in cloud-based technologies and big data platforms such as Hadoop and Spark.

Green Computing

The term green computing refers to the efficient use of computing assets in an eco-friendly manner that maintains environmental sustainability. Its goals include recycling end-of-life IT components and factory waste, minimizing the use of hazardous components in the manufacture of computers, and improving energy efficiency in data centers. Renewable energy sources, which are freely available and produce almost no pollution, form the basis of the green computing movement, and the best examples of green data centers run on geothermal, wind, and solar power. The stated goals of the drive for green computing also include designing algorithms that enhance processor efficiency, decreasing the use of environmentally hazardous materials, and promoting the use of recyclable components to build computers (Rong et al., 2016).

Organizations that need to build “green” data centers can take several steps toward that goal, including:

1. Using eco-friendly materials that are easy to recycle and leave a minimal carbon footprint. The company can also use alternative power sources for server shutdowns, paper shredding, and air compressor upgrades.

2. Enhancing data center cooling by locating company data centers in cold environments such as Alaska. Excess heat can also be siphoned out of the buildings.

3. Implementing modular data centers, which target specific parameters such as performance, speed, cost reduction, and reliability. Shipping containers feature heavily in the design of modular data centers for green computing.

4. Minimizing power usage, which is a significant driver of variable costs in the data center. It is estimated that power takes up about 10% of the recurring costs of each functional data center. Cutting that cost requires a drastic reduction in the energy used to power IT installations, of which 60% goes to running servers. This can be achieved by replacing old servers, getting rid of dormant servers, and virtualizing workloads.

5. Conducting baseline energy audits, which give the organization an outline of how energy is used at its data centers. These outlines form the benchmark for future audits and long-term planning as the company aims to reduce energy consumption.
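The arithmetic behind a baseline energy audit and the removal of dormant servers can be sketched as follows. This is a simplified model built on my own assumptions: the Server record, the 5% utilization threshold for calling a server dormant, and the sample wattages are hypothetical, and the model counts only the power drawn by the servers themselves, not cooling or other overheads.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass(frozen=True)
class Server:
    name: str
    avg_watts: float     # average power draw in watts
    utilization: float   # average CPU utilization, 0.0 to 1.0


def annual_kwh(servers: list) -> float:
    """Energy used per year by the given servers, in kilowatt-hours."""
    return sum(s.avg_watts for s in servers) * HOURS_PER_YEAR / 1000


def dormant(servers: list, threshold: float = 0.05) -> list:
    """Servers whose utilization is below the (assumed) dormancy threshold."""
    return [s for s in servers if s.utilization < threshold]


def audit(servers: list, threshold: float = 0.05) -> dict:
    """Baseline audit: current usage and the saving from retiring dormant servers."""
    baseline = annual_kwh(servers)
    idle = dormant(servers, threshold)
    saved = annual_kwh(idle)
    return {
        "baseline_kwh": baseline,
        "dormant_servers": [s.name for s in idle],
        "kwh_saved": saved,
        "percent_saved": 100 * saved / baseline if baseline else 0.0,
    }


if __name__ == "__main__":
    fleet = [
        Server("app-01", 400.0, 0.60),
        Server("legacy-02", 300.0, 0.02),
        Server("db-03", 300.0, 0.40),
    ]
    print(audit(fleet))
```

Running the same audit again after servers are replaced or workloads are virtualized gives a figure that can be compared against the baseline.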

An organization that continues to apply the concepts of green computing with remarkable success is Google. The company purchases land in remote areas where land prices are low, which lowers the overall cost of running its data centers (Wheeland, 2007). The energy sources that Google uses, such as wind, solar, and geothermal energy, are cheap and highly available. The company has also worked out how much energy is needed to power large server farms and relies only on environmentally sound solutions. Google also works actively to reduce the heat output of each server across all of its data centers (Beloglazov et al., 2012).

Conclusion

The concepts of green computing, big data, and data warehousing are ubiquitous in modern enterprise computing. Green computing requires organizations with large-scale IT infrastructure to use eco-friendly techniques and materials when establishing data centers. Big data combines variety (data from many sources), volume (more and more data is produced every day), and velocity (the increasing speed at which companies and individuals across the globe produce data). Data warehousing involves designing, building, and maintaining relational databases optimized for the storage, querying, and analysis of historical and current data in a variety of contexts, in order to yield actionable intelligence that can drive business decisions.



References

Ariyachandra, T., & Watson, H. (2010). Key organizational factors in data warehouse architecture selection. Decision Support Systems, 200-212.

Beloglazov, A., Abawajy, J., & Buyya, R. (2012). Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Generation Computer Systems, 755-768.

Rong, H., Zhang, H., Xiao, S., Li, C., & Hu, C. (2016). Optimizing energy consumption for data centers. Renewable and Sustainable Energy Reviews, 674-691.

Wheeland, M. (2007, May 2). Green Computing at Google. Retrieved from Greenbiz.com: https://www.greenbiz.com/news/2007/05/02/green-computing-google

Zikopoulos, P., & Eaton, C. (2011). Understanding big data: Analytics for enterprise class Hadoop and streaming data. McGraw-Hill Osborne Media.
