Biplots are a graphical method for simultaneously displaying two kinds of information: typically, the variables and sample units described by a multivariate data matrix, or the items labelling the rows and columns of a two-way table. This book aims to popularize what is now seen to be a useful and reliable method for the visualization of multidimensional data associated with, for example, principal component analysis, canonical variate analysis, multidimensional scaling, multiplicative interaction and various types of correspondence analysis.
Understanding Biplots:
* Introduces theory and techniques which can be applied to problems from a variety of areas, including ecology, biostatistics, finance, demography and other social sciences.
* Provides novel techniques for the visualization of multidimensional data and includes data mining techniques.
* Uses applications from many fields including finance, biostatistics, ecology, demography.
* Looks at dealing with large data sets as well as smaller ones.
* Includes colour images, illustrating the graphical capabilities of the methods.
* Is supported by a website featuring R code and datasets.
Researchers, practitioners and postgraduate students of statistics and the applied sciences will find this book a useful introduction to the possibilities of presenting data in informative ways.
Number of pages: 534
Table of Contents
Title Page
Copyright
Preface
Chapter 1: Introduction
1.1 Types of Biplots
1.2 Overview of the Book
1.3 Software
1.4 Notation
Chapter 2: Biplot Basics
2.1 A Simple Example Revisited
2.2 The Biplot as a Multidimensional Scatterplot
2.3 Calibrated Biplot Axes
2.4 Refining the Biplot Display
2.5 Scaling the Data
2.6 A Closer Look at Biplot Axes
2.7 Adding New Variables: the Regression Method
2.8 Biplots and Large Data Sets
2.9 Enclosing a Configuration of Sample Points
2.10 Buying by Mail Order Catalogue Data Set Revisited
2.11 Summary
Chapter 3: Principal Component Analysis Biplots
3.1 An Example: Risk Management
3.2 Understanding PCA and Constructing Its Biplot
3.3 Measures of Fit for PCA Biplots
3.4 Predictivities of Newly Interpolated Samples
3.5 Adding New Axes to a PCA Biplot and Defining Their Predictivities
3.6 Scaling the Data in a PCA Biplot
3.7 Functions for Constructing a PCA Biplot
3.8 Some Novel Applications and Enhancements of PCA Biplots
3.9 Conclusion
Chapter 4: Canonical Variate Analysis Biplots
4.1 An Example: Revisiting the Ocotea Data
4.2 Understanding CVA and Constructing Its Biplot
4.3 Geometric Interpretation of the Transformation to the Canonical Space
4.4 CVA Biplot Axes
4.5 Adding New Points and Variables to a CVA Biplot
4.6 Measures of Fit for CVA Biplots
4.7 Functions for Constructing a CVA Biplot
4.8 Continuing the Ocotea Example
4.9 CVA Biplots for Two Classes
4.10 A Five-Class CVA Biplot Example
4.11 Overlap in Two-Dimensional Biplots
Chapter 5: Multidimensional Scaling and Nonlinear Biplots
5.1 Introduction
5.2 The Regression Method
5.3 Nonlinear Biplots
5.4 Providing Nonlinear Biplot Axes for Variables
5.5 A PCA Biplot as a Nonlinear Biplot
5.6 Constructing Nonlinear Biplots
5.7 Examples
5.8 Analysis of Distance
5.9 Functions AODplot and PermutationAnova
Chapter 6: Two-Way Tables: Biadditive Biplots
6.1 Introduction
6.2 A Biadditive Model
6.3 Statistical Analysis of the Biadditive Model
6.4 Biplots Associated with Biadditive Models
6.5 Interpolating New Rows or Columns
6.6 Functions for Constructing Biadditive Biplots
6.7 Examples of Biadditive Biplots: the Wheat Data
6.8 Diagnostic Biplots
Chapter 7: Two-Way Tables: Biplots Associated with Correspondence Analysis
7.1 Introduction
7.2 The Correspondence Analysis Biplot
7.3 Interpolation of New (Supplementary) Points in CA Biplots
7.4 Other CA Related Methods
7.5 Functions for Constructing CA Biplots
7.6 Examples
7.7 Conclusion
Chapter 8: Multiple Correspondence Analysis
8.1 Introduction
8.2 Multiple Correspondence Analysis of the Indicator Matrix
8.3 The Burt Matrix
8.4 Similarity Matrices and the Extended Matching Coefficient
8.5 Category-Level Points
8.6 Homogeneity Analysis
8.7 Correlational Approach
8.8 Categorical (Nonlinear) Principal Component Analysis
8.9 Functions for Constructing MCA Related Biplots
8.10 Revisiting the Remuneration Data: Examples of MCA and Categorical PCA Biplots
Chapter 9: Generalized Biplots
9.1 Introduction
9.2 Calculating Inter-Sample Distances
9.3 Constructing a Generalized Biplot
9.4 Reference System
9.5 The Basic Points
9.6 Interpolation
9.7 Prediction
9.8 An Example
9.9 Function for Constructing Generalized Biplots
Chapter 10: Monoplots
10.1 Multidimensional Scaling
10.2 Monoplots Related to the Covariance Matrix
10.3 Skew-Symmetry
10.4 Area Biplots
10.5 Functions for Constructing Monoplots
References
Index
This edition first published 2011
© 2011 John Wiley & Sons, Ltd
Registered office
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloguing-in-Publication Data
Gower, John.
Understanding biplots / John Gower, Sugnet Lubbe, Niel le Roux.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-470-01255-0 (cloth)
1. Multivariate analysis–Graphic methods. 2. Graphical modeling (Statistics) I. Lubbe, Sugnet, 1973- II. le Roux, Niel. III. Title.
QA278.G685 2010
519.5′35–dc22
2010024555
A catalogue record for this book is available from the British Library.
Print ISBN: 978-0-470-01255-0
ePDF ISBN: 978-0-470-97320-2
oBook ISBN: 978-0-470-97319-6
Preface
This book grew from an earlier book, Biplots (Gower and Hand, 1996), the first monograph on the subject of biplots, written in a fairly concentrated and not easily understood style. Colleagues tactfully suggested that there was a need for a friendlier book on biplots. This book is our response. Although it covers similar ground to the Gower and Hand (1996) book, it omits some topics and adds others. No attempt has been made to be encyclopedic and many biplot methods, especially those concerned with three-way tables, are totally ignored.
Our aims in writing this book have been threefold: first, to provide the geometric background, which is essential for understanding, together with its algebraic manifestations, which are essential for writing computer programs; second, to provide a wealth of illustrative examples drawn from a wide variety of fields of application, illustrating different representatives of the biplot family; and third, to provide computer functions written in R that allow routine multivariate descriptive methods to be easily used, together with their associated biplots. It also provides additional tools for those wishing to work interactively and to develop their own extensions.
We hope that research workers in the applied sciences will find the book a useful introduction to the possibilities for presenting certain types of data in informative ways and give them the background to make valid interpretations. Statisticians may find it of interest both as a source of potential research projects and useful examples.
This project has taken longer than we had planned and we are keenly aware that some topics remain less friendly than we might have hoped. We thank Kathryn Sharples, Susan Barclay, Richard Davies, Heather Kay and Prachi Sinha-Sahay at Wiley for both their forbearance and support. We also thank our long-suffering spouses, Janet, Pieter and Magda, if not for their active support, then at least for their forbearance.
John Gower
Sugnet Lubbe
Niël le Roux
www.wiley.com/go/biplots
Chapter 2
Biplot Basics
In accordance with our aim of understanding biplots, the focus in this chapter is to look at biplot basics from the viewpoint of an ordinary scatterplot.
The chapter begins by introducing two- and three-dimensional biplots as ordinary scatterplots of two or three variables. In Section 2.2 biplots are considered as extensions of the ordinary scatterplot by providing for more than three variables. Generalizing, a biplot provides for a graphical display, in at most three dimensions, of data that typically exist in a higher-dimensional space. The concept of approximating a data matrix is thus crucial in biplot methodology. Subsequent sections explore how to represent multidimensional sample points in a biplot, how to equip the biplot with calibrated axes representing the variables and how to refine the biplot display. Emphasis is placed on how to use biplot axes analogously to axes in a scatterplot, that is, for adding new samples to the plot (interpolation) and reading off for any sample point its values for the different variables (prediction). It is then shown how to use a regression method for adding new variables to the plot. Various enhancements to configurations of sample points in a biplot, including how to describe large data sets, are discussed next. Finally, some examples are given, together with the R code for constructing all the graphical displays shown in the chapter. We strongly suggest that readers work through these examples for a thorough understanding of the basics of biplot construction. In later chapters, we provide only the function calls to more elaborate R functions for fine-tuning the various types of biplot.
2.1 A Simple Example Revisited
The data of Table 1.1 are available in the accompanying R package UBbipl in the form of the dataframe aircraft.data. We first convert columns 2 to 5 of this dataframe to a data matrix, aircraft.mat, with row names the first column of Table 1.1 and column names the abbreviations used for the variables in Table 1.1. This is done by issuing the following instructions from the R prompt:
> aircraft.mat <- aircraft.data[,2:5]
> aircraft.mat
    SPR  RGF   PLF  SLF
a 1.468 3.30 0.166 0.10
b 1.605 3.64 0.154 0.10
.......................
v 7.105 5.40 0.089 3.20
w 8.548 4.20 0.222 2.90
Next, we construct a scatterplot of the two variables SPR and RGF with the instructions:
> plot(x = aircraft.mat[,1], y = aircraft.mat[,2], xlab = "",
    ylab = "", xlim = c(0,10), ylim = c(2,6), pch = 15,
    col = "green", yaxp = c(2,6,4), bty = "n")
> text(x = aircraft.mat[,1], y = aircraft.mat[,2],
    labels = dimnames(aircraft.mat)[[1]], pos = 1)
> mtext("RGF", side = 2, at = 6.4, line = -0.35)
> mtext("SPR", side = 1, at = 10.4, line = -0.50)
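For readers who do not have the UBbipl package installed, the same instructions can be tried on a small stand-in matrix. The following is only a sketch: the object name aircraft.sub is hypothetical, and it holds just the four rows (a, b, v, w) printed above rather than the full data of Table 1.1, so the resulting plot shows four points only.

```r
# Stand-in for aircraft.mat: only the four rows printed above (a, b, v, w).
# The name aircraft.sub is an assumption made for this illustration;
# the full matrix comes from aircraft.data in the UBbipl package.
aircraft.sub <- matrix(
  c(1.468, 3.30, 0.166, 0.10,
    1.605, 3.64, 0.154, 0.10,
    7.105, 5.40, 0.089, 3.20,
    8.548, 4.20, 0.222, 2.90),
  ncol = 4, byrow = TRUE,
  dimnames = list(c("a", "b", "v", "w"),
                  c("SPR", "RGF", "PLF", "SLF"))
)

# Scatterplot of SPR against RGF with the same graphical settings as above
plot(x = aircraft.sub[, "SPR"], y = aircraft.sub[, "RGF"],
     xlab = "", ylab = "", xlim = c(0, 10), ylim = c(2, 6),
     pch = 15, col = "green", yaxp = c(2, 6, 4), bty = "n")

# Label each point with its row name, placed below the plotting symbol
text(x = aircraft.sub[, "SPR"], y = aircraft.sub[, "RGF"],
     labels = rownames(aircraft.sub), pos = 1)

# Axis labels placed at the high end of each axis
mtext("RGF", side = 2, at = 6.4, line = -0.35)
mtext("SPR", side = 1, at = 10.4, line = -0.50)
```

Selecting columns by name rather than by position keeps the sketch independent of the column order in the dataframe.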
The scatterplot in Figure 2.1 is an example of what is probably the simplest form of an asymmetric biplot. It shows a plot of the columns SPR and RGF, giving performance figures for power and range of the 21 types of aircraft introduced in Table 1.1. It is a scatterplot of two variables referred to orthogonal axes. The familiar elements of Figure 2.1 are:
* points representing the aircraft;
* a directed line for each of the variables, known as a coordinate axis, with its label;
* scales marked on the axes giving the values of the variables.
Figure 2.1 Scatterplot of variables SPR and RGF from the aircraft data in Table 1.1: (top) constructed with default settings; (bottom) constructed with an aspect ratio of unity.
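The caption of Figure 2.1 refers to an aspect ratio of unity. In base R this can be requested with the asp argument of plot(). The sketch below is an illustration only, not the code used for the figure: it uses the hypothetical object name xy and only the four rows (a, b, v, w) of the SPR and RGF columns printed earlier, and compares the inter-point distances computed by dist() with what such a plot displays.

```r
# SPR and RGF values for the four rows printed earlier (a, b, v, w);
# the name xy is an assumption made for this illustration
xy <- cbind(SPR = c(a = 1.468, b = 1.605, v = 7.105, w = 8.548),
            RGF = c(a = 3.30, b = 3.64, v = 5.40, w = 4.20))

# Euclidean distances between the points, in the units of the variables
d <- dist(xy)
round(as.matrix(d), 3)

# With asp = 1, one unit on the SPR axis has the same physical length as
# one unit on the RGF axis, so distances seen in the plot are
# proportional to the distances in d
plot(xy, asp = 1, pch = 15, col = "green",
     xlab = "SPR", ylab = "RGF", bty = "n")
text(xy, labels = rownames(xy), pos = 1)
```

Without asp = 1, R stretches each axis to fill the plotting region, so visual nearness between points need not agree with the computed distances.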
Note also the convention followed of labelling the axes at the end where the calibrations are at their highest values. It is an asymmetric biplot because it gives information of two types, (i) concerning the 21 aircraft and (ii) concerning the two variables, which cannot be interchanged. When a point representing an aircraft is projected orthogonally onto an axis, one may read off the value of the corresponding variable and this will agree precisely with the value given in Table 1.1. Indeed, this is not surprising, because the values of the variables were those used in the first place to construct the coordinate positions of the points. Notice the difference between the top and bottom panels of Figure 2.1. Which of k and n is nearest to j? From the top panel, it appears to be n, but a simple calculation shows the true distances to be