E - Augment


Overview

Data augmentation (enrichment) typically involves adding further valuable information to existing data. For example, you may have a lookup table with economic and demographic data that you want to add to your analysis in order to provide better statistical weighting. Or, for stock market data, you may want to obtain a list of current credit ratings and key financial data for the corporations you are tracking. We suggest using the functions already described in the previous sections.
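The lookup-table idea can be sketched in a few lines. The sketch below is written in Python purely as a language-neutral illustration. It is not B4P code and does not represent any B4P function. The names `lookup`, `records`, `defaults` and `enrich` are hypothetical, and the region and currency columns are assumptions chosen only to show how extra columns are attached by key.

```python
# Hypothetical lookup table keyed by country name (illustration only).
lookup = {
    "Denmark": {"region": "Europe", "currency": "DKK"},
    "China": {"region": "Asia", "currency": "CNY"},
}

# Records to be enriched; inhabitant counts are taken from Table C1 on this page.
records = [
    {"country": "Denmark", "inhabitants": 5948136},
    {"country": "China", "inhabitants": 1422584933},
    {"country": "Timor-Leste", "inhabitants": 1384286},
]

# Values used when a country has no entry in the lookup table.
defaults = {"region": "", "currency": ""}


def enrich(records, lookup, defaults):
    """Return new records with the lookup columns added.

    The input records are left unchanged; rows without a match
    receive the default (blank) values.
    """
    enriched = []
    for rec in records:
        extra = lookup.get(rec["country"], defaults)
        enriched.append({**rec, **extra})
    return enriched


result = enrich(records, lookup, defaults)
for row in result:
    print(row)
```

Rows without a match (here Timor-Leste) keep blank values instead of being dropped, so the enriched table retains every original row.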


Wikipedia Example (continued from step 4)

Enrich this table as follows:

  • Specify plausible surface areas for China and Denmark (i.e. without Greenland and without territories that are merely claimed).
  • Add the population density and the variation between the inhabitant counts from two different sources.

Simple Example

[c1:Country,{China,Denmark,'Timor-Leste'},Area] = {9597000,42952,14950}; // km2
// Denmark without Greenland, China without the South China Sea, and the area was missing for Timor-Leste (East Timor)

table insert columns            ( c1, { Inhabitants Variation, Inhabitants per km2 } );
table process selected rows     ( c1, [Inhabitants] == '', [Inhabitants] = [Population]);
table process                   ( c1, [Inhabitants per km2]   = [Inhabitants] / [Area];
                                      [Inhabitants Variation] = ([Inhabitants] - [Population])/[Inhabitants] );
echo( "Table C1: ");
table list                      ( c1, briefly, 4, last col, 2 ); // List just 3 columns and first and last 4 rows

Enrichtment done.
Table C1:
    0 : Country     | Area    | Inhabitants
    1 : Afghanistan | 652230  | 41100000   
    2 : Egypt       | 1001450 | 103500000  
    3 : Albania     | 28748   | 2800000    
    4 : Algeria     | 2381741 | 44900000   
  ... :
  194 : Cyprus      | 9251    | 1300000    
  195 : China       | 9597000 | 1422584933
  196 : Denmark     | 42952   | 5948136    
  197 : Timor-Leste | 14950   | 1384286    
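
As a quick plausibility check of the two formulas used above, the following sketch recomputes the population density for the last three rows of Table C1. It is written in Python as an illustration only and is not B4P code. The function names `density` and `variation` are hypothetical. The second-source population figures are not shown in the listing, so the variation call uses made-up numbers that are labeled as such.

```python
# (country, area in km2, inhabitants) copied from the last three rows of Table C1.
rows = [
    ("China", 9597000, 1422584933),
    ("Denmark", 42952, 5948136),
    ("Timor-Leste", 14950, 1384286),
]


def density(inhabitants, area_km2):
    """Inhabitants per km2, same formula as [Inhabitants] / [Area]."""
    return inhabitants / area_km2


def variation(inhabitants, population):
    """Relative difference between two sources,
    same formula as ([Inhabitants] - [Population]) / [Inhabitants]."""
    return (inhabitants - population) / inhabitants


densities = {country: round(density(inh, area), 1) for country, area, inh in rows}
print(densities)
# {'China': 148.2, 'Denmark': 138.5, 'Timor-Leste': 92.6}

# Made-up numbers, NOT from the table: 1000 from one source, 900 from the other.
print(variation(1000, 900))  # 0.1
```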

Try it yourself: Open TAB_Features_Enrichment.b4p in B4P_Examples.zip. Decompress before use.