Seminar – Symbolic Data Analysis: Representing and Analysing Data with Variability (Prof Paula Brito)

You are invited to attend the April Meeting of the NSW Branch.

Date:  Tuesday, 5 April 2016

6:00pm – 6:30pm: Refreshments
6:30pm – 7:30pm: Lecture
7:45pm onwards: Dinner (at a nearby restaurant)

Venue: CB04.05.430, Level 5, University of Technology, Sydney – Building 4, 745 Harris Street, Broadway, NSW 2007.

Paula Brito

Faculdade de Economia / LIAAD – INESC TEC
University of Porto, Portugal

Symbolic Data Analysis
Representing and Analysing Data with Variability

(The slides for this talk can be viewed here.)

Symbolic Data Analysis, introduced by E. Diday in the late 1980s, is concerned with analysing data presenting intrinsic variability, which is to be explicitly taken into account. In classical Statistics and Multivariate Data Analysis, the elements under analysis are generally individual entities for which a single value is recorded for each variable – e.g., individuals, described by their age, salary, education level, marital status, etc.; cars, each described by its weight, length, power, engine displacement, etc.; students, for each of whom the marks in different subjects were recorded. But when the elements of interest are classes or groups of some kind – the citizens living in given towns; car models, rather than specific vehicles; classes and not individual students – then there is variability inherent in the data. Reducing this variability to measures of central tendency – means, medians or modes – leads to too great a loss of information.

Symbolic Data Analysis provides a framework for representing data with variability, using new variable types. Methods have also been developed that suitably take data variability into account. Symbolic data may be represented using the usual matrix-form data arrays, where each row represents an entity and each column corresponds to a different variable – but now the elements of each cell are generally not single real values or categories, as in the classical case, but rather finite sets of values, intervals or, more generally, distributions.
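
As a rough illustration of the representation described above (not taken from the talk), the following Python sketch aggregates individual records into one symbolic description per group. The car records, the field names and the helper `to_symbolic` are all hypothetical, invented for this example; each resulting cell holds an interval, a finite set of values or a distribution rather than a single value.

```python
from collections import Counter

# Hypothetical micro-data: one row per individual car (values invented for illustration).
cars = [
    {"model": "A", "weight_kg": 1100, "fuel": "petrol"},
    {"model": "A", "weight_kg": 1250, "fuel": "diesel"},
    {"model": "A", "weight_kg": 1180, "fuel": "petrol"},
    {"model": "B", "weight_kg": 1500, "fuel": "diesel"},
    {"model": "B", "weight_kg": 1620, "fuel": "diesel"},
]


def to_symbolic(rows, group_key, numeric_key, categorical_key):
    """Aggregate individual rows into one symbolic description per group."""
    # Collect the individual rows belonging to each group (e.g. each car model).
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row)

    table = {}
    for name, members in groups.items():
        values = [m[numeric_key] for m in members]
        counts = Counter(m[categorical_key] for m in members)
        table[name] = {
            # Interval-valued variable: (min, max) observed within the group.
            numeric_key: (min(values), max(values)),
            # Set-valued variable: the categories that occur in the group.
            categorical_key + "_set": set(counts),
            # Distribution-valued variable: relative frequency of each category.
            categorical_key + "_dist": {
                c: n / len(members) for c, n in counts.items()
            },
        }
    return table


symbolic_cars = to_symbolic(cars, "model", "weight_kg", "fuel")
for model, description in symbolic_cars.items():
    print(model, description)
```

Here each car model, rather than each vehicle, becomes a row of the data array, and the within-model variability is kept in the cells instead of being collapsed to a mean or a mode.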

In this talk we shall introduce and motivate the field of Symbolic Data Analysis. We present the new variable types that have been defined to represent variability, illustrating with some examples, and considering data representation models for some cases. We shall furthermore discuss issues that arise when analysing data that does not follow the usual classical model. Time allowing, some multivariate data analysis methods will be presented. Software for the analysis of symbolic data shall be mentioned.

Biography of Paula Brito

Paula Brito is Associate Professor in Statistics and Data Analysis at the University of Porto, Portugal. She is an expert in the development of innovative methods for the statistical analysis of multidimensional complex data, including symbolic data analysis, clustering methods and other popular machine learning techniques.
When: 05/04/2016
Time: 6:00 pm - 7:30 pm
Cost: Free
