Dev Unit 4
(28, 30 | 35, 40)
The two numbers in the middle are 30 and 35. You add them and divide the sum by two, and the result is:
Q3 = (30 + 35)/2
Q3 = 65/2
Q3 = 32.5

27. How to calculate the IQR in an even dataset
The formula for calculating the IQR is exactly the same as the one we used for the odd dataset.
IQR = Q3 - Q1
IQR = 32.5 - 17.5
IQR = 15

28. How to find an outlier in an even dataset
As a recap, the five number summary so far includes MIN = 10, Q1 = 17.5 and Q3 = 32.5.
To find any outliers in the dataset, the rule is: outlier < Q1 - 1.5(IQR) or outlier > Q3 + 1.5(IQR).
To find any lower outliers, you calculate Q1 - 1.5(IQR) and see if there are any values less than the result:
outlier < 17.5 - 1.5(15)
outlier < 17.5 - 22.5
outlier < -5
There aren't any values in the dataset that are less than -5.
Finally, to find any higher outliers, you calculate Q3 + 1.5(IQR) and see if there are any values in the dataset that are higher than the result:
outlier > 32.5 + 1.5(15)
outlier > 32.5 + 22.5
outlier > 55
There aren't any values higher than 55, so this dataset doesn't have any outliers.

29. What is a Multiple Boxplot?
Multiple boxplots are quite useful charts when it comes to visualizing several groups or categories, their median and variability, all at once.
[Figure: multiple boxplot chart]

30. What is the T test?
A t-test is a statistical test that is used to compare the means of two groups. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups are different from one another.

31. When to use a t-test?
A t-test can only be used when comparing the means of two groups (a.k.a. pairwise comparison). If you want to compare more than two groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a post-hoc test.

32. Explain the correlation coefficient with example.
A correlation coefficient describes the relationship between pairs of variables. Here we concentrate on the type of correlation coefficient, designated as r, that describes the relationship between pairs of variables for quantitative data.
Pearson Correlation Coefficient (r): A number between -1.00 and +1.00 that describes the linear relationship between pairs of quantitative variables. It has the following properties.

Sign of r: A number with a plus sign (or no sign) indicates a positive relationship, and a number with a minus sign indicates a negative relationship.

Numerical Value of r: The more closely a value of r approaches either -1.00 or +1.00, the stronger (more regular) the relationship. Conversely, the more closely the value of r approaches 0, the weaker (less regular) the relationship.

Interpretation of r: Located along a scale from -1.00 to +1.00, the value of r supplies information about the direction of a linear relationship (positive or negative) and, generally, about its relative strength. The relationship is relatively weak (and r is a poor describer of the data) when r is in the vicinity of 0. It is relatively strong (and r is a good describer of the data) when r deviates from 0 in the direction of either +1.00 or -1.00.

r Is Independent of Units of Measurement: The value of r is independent of the original units of measurement. In fact, the same value of r describes the correlation between height and weight for a group of adults, regardless of whether height is measured in inches or centimeters or whether weight is measured in pounds or some other unit.

Verbal Descriptions: When interpreting a brand new r, you'll find it helpful to translate the numerical value of r into a verbal description of the relationship.
An r of .70 for the height and weight of college students could be translated into "Taller students tend to weigh more" (or some other equally valid statement, such as "Lighter students tend to be shorter"). An r of -.42 for time spent taking an exam and the subsequent exam score could be translated into "Students who take less time tend to make higher scores". An r in the neighborhood of 0 for shoe size and IQ could be translated into "Little, if any, relationship exists between shoe size and IQ."

Correlation Not Necessarily Cause-Effect: A correlation coefficient, regardless of size, never provides information about whether an observed relationship reflects a simple cause-effect relationship or some more complex state of affairs. Given a correlation between poverty and crime in cities, you can speculate that poverty causes crime with some degree of inevitability. According to this view, any widespread reduction in poverty should reduce crime. Alternatively, you can speculate that some common cause, or some combination of causes, produces both poverty and crime. According to this view, a widespread reduction in poverty should have no effect on crime. Which speculation is correct? The issue cannot be resolved merely on the basis of an observed correlation.

[FIGURE 6.5: Effect of range restriction on the value of r (scatterplot with Height on the Y axis)]

COMPUTATION FORMULA FOR r
Calculate a value for r by using the following computation formula:

CORRELATION COEFFICIENT (COMPUTATION FORMULA)
\[ r = \frac{SP_{xy}}{\sqrt{SS_x \, SS_y}} \]
where the two sum of squares terms in the denominator are defined as
\[ SS_x = \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n} \]
\[ SS_y = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n} \]
and the sum of products term in the numerator is defined as
\[ SP_{xy} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n} \]
The sum of products reflects the strength of the relationship: stronger relationships produce larger positive or negative sums of products. Table 6.3 illustrates the calculation of r for the original greeting card data by using the computation formula.

Table 6.3 CALCULATION OF r: COMPUTATION FORMULA
Assign a value to n (1), representing the number of pairs of scores.
Sum all scores for X (2) and for Y (3).
Find the product of each pair of X and Y scores (4), one at a time, then add all of these products (5).
Square each X score (6), one at a time, then add all squared X scores (7).
Square each Y score (8), one at a time, then add all squared Y scores (9).
Substitute numbers into formulas (10) and solve for SP_xy, SS_x and SS_y.
Substitute into formula (11) and solve for r.
[Table: DATA AND COMPUTATIONS. Columns: FRIEND, CARDS SENT (X), CARDS RECEIVED (Y)]

34. Explain the Scatterplots and Resistant Line.
Scatter plots are graphs that present the relationship between two variables in a data set. They represent data points on a two-dimensional plane or on a Cartesian system. The independent variable or attribute is plotted on the X-axis, while the dependent variable is plotted on the Y-axis. These plots are often called scatter graphs or scatter diagrams.

Scatter plots instantly report a large volume of data. They are beneficial in the following situations:
• For a large set of data points given
• Each set comprises a pair of values
• The given data is in numeric form

The line drawn in a scatter plot, which is near to almost all the points in the plot, is known as the "line of best fit" or "trend line".
[Figure: scatter plot of X and Y values]

Scatter plot Correlation
Correlation is a statistical measure of the relationship between the movements of the two variables. If the variables are correlated, the points will fall along a line or curve. The better the correlation, the closer the points will lie to the line. This cause analysis tool is considered one of the seven essential quality tools.

The scatter plot explains the correlation between two attributes or variables. It represents how closely the two variables are connected. There can be three such situations to see the relation between the two variables:
1. Positive Correlation
2. Negative Correlation
3. No Correlation

Positive Correlation
When the points in the graph are rising, moving from left to right, the scatter plot shows a positive correlation. It means the values of one variable are increasing with respect to another.
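The direction of a relationship can also be checked numerically with the computation formula for r given earlier in these notes. The following is a minimal sketch under stated assumptions: the function name `pearson_r` and both data sets are made up for illustration and are not the greeting card data from Table 6.3. Points that rise from left to right should give a positive r, and points that fall should give a negative r.

```python
import math

def pearson_r(xs, ys):
    """Pearson r via the computation formula r = SP_xy / sqrt(SS_x * SS_y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n = len(xs)                                    # number of pairs
    sum_x, sum_y = sum(xs), sum(ys)                # sums of X and of Y
    sum_xy = sum(x * y for x, y in zip(xs, ys))    # sum of the XY products
    sum_x2 = sum(x * x for x in xs)                # sum of squared X scores
    sum_y2 = sum(y * y for y in ys)                # sum of squared Y scores
    sp_xy = sum_xy - sum_x * sum_y / n             # sum of products
    ss_x = sum_x2 - sum_x ** 2 / n                 # sum of squares for X
    ss_y = sum_y2 - sum_y ** 2 / n                 # sum of squares for Y
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when X or Y has no variability")
    return sp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 6]     # tends to rise from left to right
y_falling = [6, 4, 5, 4, 2]    # tends to fall from left to right

r_pos = pearson_r(x, y_rising)
r_neg = pearson_r(x, y_falling)
print(f"rising points:  r = {r_pos:.2f}")
print(f"falling points: r = {r_neg:.2f}")
```

The sign of the result gives the direction of the relationship, and its distance from 0 gives a rough indication of strength, matching the properties of r listed earlier.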
Positive correlation can be further classified into three categories:
• Perfect Positive: the points form a perfectly straight line
• High Positive: all points are nearby
• Low Positive: all the points are scattered
[Figure: perfect positive correlation, high positive correlation]

Negative Correlation
When the points in the graph are falling, moving from left to right, the scatter plot shows a negative correlation. It means the values of one variable are decreasing with respect to another. These are also classified as perfect, high and low negative correlation.
[Figure: high negative correlation, low negative correlation]

No Correlation
When the points are scattered all over the graph and it is difficult to conclude whether the values are increasing or decreasing, there is no correlation between the variables.

Scatter plot Example
Let us understand how to construct a scatter plot with the help of the example below.
Example: Draw a scatter plot for the given data that shows the number of games played and the scores obtained in each instance.
[Table: No. of games (3, 5, 2, 6, 7, 1, ...) and Scores]
Solution:
X-axis: Number of games
Y-axis: Scores
[Figure: scatter plot of Scores against Number of games]

Resistant Line
In exploratory data analysis, the term resistant line usually refers to a line fitted through the medians of groups of points, so that outliers have little influence on it. The description that follows covers the resistance line used in stock charts.

A resistance line, sometimes also known as a speed line, helps identify stock trends and levels of support and resistance. Resistance lines are technical indication tools used by equity analysts and investors to determine the price trend of a specific stock. They are very useful in predicting the probable movement of stock prices and in helping people invest in the right stock.

Resistance lines are usually drawn on a high-to-low basis. They help estimate resistance and support levels, making them a very useful tool in trading. A resistance line in an uptrend movement marks the support area, and in a downtrend movement it marks the resistance area. The three lines in the graph below indicate a downtrend movement, and reading them correctly helps lead to a sound investment decision.
[Figure: stock chart with three resistance lines used to make predictions]

35. What is Transformation?
Data transformation refers to the application of a function to each item in a data set. Here each data item x is replaced by its transformed value y, where y = f(x). Transformations are generally carried out to make the appearance of graphs more interpretable. There are four major functions used for transformations.

• log x: Logarithm Transformations. Log transformation is a data transformation method in which each value x is replaced with log(x). For example, sound units are in decibels and are generally represented using log transformations.

• 1/x: Reciprocal Transformations. A transformation of raw data that involves (a) replacing the original data units with their reciprocals and (b) analyzing the modified data. It can be used with nonzero data and is commonly used when distributions have skewness or clear outliers. Unlike other transformations, a reciprocal transformation changes the order of the original data. Also called inverse transformation. For example, the time to complete a race or task can be expressed using speed: the greater the speed, the less time taken.

• √x: Square Root Transformations. This consists of taking the square root of each value. The back transformation is to square the number. If you have negative numbers, you can't take the square root, so you should add a constant to each number to make them all positive. For example, areas of circular grounds are compared using their radius.

• x^n: Power Transformations. Power transforms are a family of transformations using power laws. The idea is to apply a transformation to each feature. What's the purpose of a power transform? To improve the symmetry of the distribution of the features. If a feature is asymmetric, applying a power transformation will make it more symmetric.

Logarithm and square root transforms are used in the case of positive numbers, whereas reciprocal and power transformations can be used in the case of both negative and positive numbers. The following diagrams illustrate the transformations graphically.
[Figure: data before and after transformation]
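To make the four transformation families concrete, here is a minimal sketch that applies each one to a small set of positive, right-skewed values. The data, the variable names, the choice of base-10 logarithm and exponent 2, and the helper `shifted_sqrt` are all assumptions made for illustration.

```python
import math

# Hypothetical positive, right-skewed values (illustration only).
data = [1.0, 4.0, 9.0, 16.0, 100.0]

log_t = [math.log10(x) for x in data]    # logarithm transformation (base 10 assumed)
recip_t = [1.0 / x for x in data]        # reciprocal transformation (nonzero data only)
sqrt_t = [math.sqrt(x) for x in data]    # square root transformation
power_t = [x ** 2 for x in data]         # power transformation (exponent 2 assumed)

# Back transformation of the square root: square each transformed value.
back = [y ** 2 for y in sqrt_t]

def shifted_sqrt(values):
    """Add a constant so no value is negative, then take the square root."""
    lowest = min(values)
    shift = -lowest if lowest < 0 else 0.0
    return [math.sqrt(v + shift) for v in values]

print("log10:     ", log_t)
print("reciprocal:", recip_t)   # order is reversed relative to the input
print("sqrt:      ", sqrt_t)
print("power:     ", power_t)
print("shifted:   ", shifted_sqrt([-4.0, 0.0, 5.0]))
```

In this sample the largest value is 100 times the smallest before transformation. After the log and square root transformations the spread is much narrower, which is why they are used on skewed data. The reciprocal output is in descending order for ascending input, which illustrates the change of order noted above.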