A deeper look into data.table functionalities

Welcome to our new blog, More into data.table. We hope you enjoy reading what we have to say!

Basil Okola https://github.com/Bokola
2020-12-04

Data table syntax is of the form DT[i, j, by]

Data Manipulation

Just looking

irisDT = as.data.table(iris)
tables() # show loaded tables
     NAME NROW NCOL MB
1: irisDT  150    5  0
                                                        COLS KEY
1: Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species    
Total: 0MB

Sorting / ordering rows

setorder(data.table, …)

-: to sort a variable in descending order

setorder(irisDT, Sepal.Length, Sepal.Width)
irisDT
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          4.3         3.0          1.1         0.1    setosa
  2:          4.4         2.9          1.4         0.2    setosa
  3:          4.4         3.0          1.3         0.2    setosa
  4:          4.4         3.2          1.3         0.2    setosa
  5:          4.5         2.3          1.3         0.3    setosa
 ---                                                            
146:          7.7         2.6          6.9         2.3 virginica
147:          7.7         2.8          6.7         2.0 virginica
148:          7.7         3.0          6.1         2.3 virginica
149:          7.7         3.8          6.7         2.2 virginica
150:          7.9         3.8          6.4         2.0 virginica
setorder(irisDT, -Species, Sepal.Width)
irisDT
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.0         2.2          5.0         1.5 virginica
  2:          4.9         2.5          4.5         1.7 virginica
  3:          5.7         2.5          5.0         2.0 virginica
  4:          6.3         2.5          5.0         1.9 virginica
  5:          6.7         2.5          5.8         1.8 virginica
 ---                                                            
146:          5.4         3.9          1.3         0.4    setosa
147:          5.8         4.0          1.2         0.2    setosa
148:          5.2         4.1          1.5         0.1    setosa
149:          5.5         4.2          1.4         0.2    setosa
150:          5.7         4.4          1.5         0.4    setosa

Removing information

Selecting rows

DT[i, j, by]

irisDT[3:4]
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:          5.7         2.5            5         2.0 virginica
2:          6.3         2.5            5         1.9 virginica
irisDT[3:4,] #row 3 and 4
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:          5.7         2.5            5         2.0 virginica
2:          6.3         2.5            5         1.9 virginica
irisDT[-(1:5)] #delete rows 1:5
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.1         2.6          5.6         1.4 virginica
  2:          7.7         2.6          6.9         2.3 virginica
  3:          5.8         2.7          5.1         1.9 virginica
  4:          5.8         2.7          5.1         1.9 virginica
  5:          6.3         2.7          4.9         1.8 virginica
 ---                                                            
141:          5.4         3.9          1.3         0.4    setosa
142:          5.8         4.0          1.2         0.2    setosa
143:          5.2         4.1          1.5         0.1    setosa
144:          5.5         4.2          1.4         0.2    setosa
145:          5.7         4.4          1.5         0.4    setosa
irisDT[!(1:5)] # using false to delete as well
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.1         2.6          5.6         1.4 virginica
  2:          7.7         2.6          6.9         2.3 virginica
  3:          5.8         2.7          5.1         1.9 virginica
  4:          5.8         2.7          5.1         1.9 virginica
  5:          6.3         2.7          4.9         1.8 virginica
 ---                                                            
141:          5.4         3.9          1.3         0.4    setosa
142:          5.8         4.0          1.2         0.2    setosa
143:          5.2         4.1          1.5         0.1    setosa
144:          5.5         4.2          1.4         0.2    setosa
145:          5.7         4.4          1.5         0.4    setosa
irisDT[.N] # last row
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1:          5.7         4.4          1.5         0.4  setosa
irisDT[1:(.N-10)] # all but the last 10 rows
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.0         2.2          5.0         1.5 virginica
  2:          4.9         2.5          4.5         1.7 virginica
  3:          5.7         2.5          5.0         2.0 virginica
  4:          6.3         2.5          5.0         1.9 virginica
  5:          6.7         2.5          5.8         1.8 virginica
 ---                                                            
136:          4.9         3.6          1.4         0.1    setosa
137:          5.0         3.6          1.4         0.2    setosa
138:          5.1         3.7          1.5         0.4    setosa
139:          5.3         3.7          1.5         0.2    setosa
140:          5.4         3.7          1.5         0.2    setosa
irisDT[Species == "virginica"] # based on conditions fulfilled in columns
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
 1:          6.0         2.2          5.0         1.5 virginica
 2:          4.9         2.5          4.5         1.7 virginica
 3:          5.7         2.5          5.0         2.0 virginica
 4:          6.3         2.5          5.0         1.9 virginica
 5:          6.7         2.5          5.8         1.8 virginica
 6:          6.1         2.6          5.6         1.4 virginica
 7:          7.7         2.6          6.9         2.3 virginica
 8:          5.8         2.7          5.1         1.9 virginica
 9:          5.8         2.7          5.1         1.9 virginica
10:          6.3         2.7          4.9         1.8 virginica
11:          6.4         2.7          5.3         1.9 virginica
12:          5.6         2.8          4.9         2.0 virginica
13:          5.8         2.8          5.1         2.4 virginica
14:          6.2         2.8          4.8         1.8 virginica
15:          6.3         2.8          5.1         1.5 virginica
16:          6.4         2.8          5.6         2.1 virginica
17:          6.4         2.8          5.6         2.2 virginica
18:          7.4         2.8          6.1         1.9 virginica
19:          7.7         2.8          6.7         2.0 virginica
20:          6.3         2.9          5.6         1.8 virginica
21:          7.3         2.9          6.3         1.8 virginica
22:          5.9         3.0          5.1         1.8 virginica
23:          6.0         3.0          4.8         1.8 virginica
24:          6.1         3.0          4.9         1.8 virginica
25:          6.5         3.0          5.8         2.2 virginica
26:          6.5         3.0          5.5         1.8 virginica
27:          6.5         3.0          5.2         2.0 virginica
28:          6.7         3.0          5.2         2.3 virginica
29:          6.8         3.0          5.5         2.1 virginica
30:          7.1         3.0          5.9         2.1 virginica
31:          7.2         3.0          5.8         1.6 virginica
32:          7.6         3.0          6.6         2.1 virginica
33:          7.7         3.0          6.1         2.3 virginica
34:          6.4         3.1          5.5         1.8 virginica
35:          6.7         3.1          5.6         2.4 virginica
36:          6.9         3.1          5.4         2.1 virginica
37:          6.9         3.1          5.1         2.3 virginica
38:          6.4         3.2          5.3         2.3 virginica
39:          6.5         3.2          5.1         2.0 virginica
40:          6.8         3.2          5.9         2.3 virginica
41:          6.9         3.2          5.7         2.3 virginica
42:          7.2         3.2          6.0         1.8 virginica
43:          6.3         3.3          6.0         2.5 virginica
44:          6.7         3.3          5.7         2.1 virginica
45:          6.7         3.3          5.7         2.5 virginica
46:          6.2         3.4          5.4         2.3 virginica
47:          6.3         3.4          5.6         2.4 virginica
48:          7.2         3.6          6.1         2.5 virginica
49:          7.7         3.8          6.7         2.2 virginica
50:          7.9         3.8          6.4         2.0 virginica
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
irisDT[Species %like% "^v"] # using %like% helper function
     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
  1:          6.0         2.2          5.0         1.5  virginica
  2:          4.9         2.5          4.5         1.7  virginica
  3:          5.7         2.5          5.0         2.0  virginica
  4:          6.3         2.5          5.0         1.9  virginica
  5:          6.7         2.5          5.8         1.8  virginica
  6:          6.1         2.6          5.6         1.4  virginica
  7:          7.7         2.6          6.9         2.3  virginica
  8:          5.8         2.7          5.1         1.9  virginica
  9:          5.8         2.7          5.1         1.9  virginica
 10:          6.3         2.7          4.9         1.8  virginica
 11:          6.4         2.7          5.3         1.9  virginica
 12:          5.6         2.8          4.9         2.0  virginica
 13:          5.8         2.8          5.1         2.4  virginica
 14:          6.2         2.8          4.8         1.8  virginica
 15:          6.3         2.8          5.1         1.5  virginica
 16:          6.4         2.8          5.6         2.1  virginica
 17:          6.4         2.8          5.6         2.2  virginica
 18:          7.4         2.8          6.1         1.9  virginica
 19:          7.7         2.8          6.7         2.0  virginica
 20:          6.3         2.9          5.6         1.8  virginica
 21:          7.3         2.9          6.3         1.8  virginica
 22:          5.9         3.0          5.1         1.8  virginica
 23:          6.0         3.0          4.8         1.8  virginica
 24:          6.1         3.0          4.9         1.8  virginica
 25:          6.5         3.0          5.8         2.2  virginica
 26:          6.5         3.0          5.5         1.8  virginica
 27:          6.5         3.0          5.2         2.0  virginica
 28:          6.7         3.0          5.2         2.3  virginica
 29:          6.8         3.0          5.5         2.1  virginica
 30:          7.1         3.0          5.9         2.1  virginica
 31:          7.2         3.0          5.8         1.6  virginica
 32:          7.6         3.0          6.6         2.1  virginica
 33:          7.7         3.0          6.1         2.3  virginica
 34:          6.4         3.1          5.5         1.8  virginica
 35:          6.7         3.1          5.6         2.4  virginica
 36:          6.9         3.1          5.4         2.1  virginica
 37:          6.9         3.1          5.1         2.3  virginica
 38:          6.4         3.2          5.3         2.3  virginica
 39:          6.5         3.2          5.1         2.0  virginica
 40:          6.8         3.2          5.9         2.3  virginica
 41:          6.9         3.2          5.7         2.3  virginica
 42:          7.2         3.2          6.0         1.8  virginica
 43:          6.3         3.3          6.0         2.5  virginica
 44:          6.7         3.3          5.7         2.1  virginica
 45:          6.7         3.3          5.7         2.5  virginica
 46:          6.2         3.4          5.4         2.3  virginica
 47:          6.3         3.4          5.6         2.4  virginica
 48:          7.2         3.6          6.1         2.5  virginica
 49:          7.7         3.8          6.7         2.2  virginica
 50:          7.9         3.8          6.4         2.0  virginica
 51:          5.0         2.0          3.5         1.0 versicolor
 52:          6.0         2.2          4.0         1.0 versicolor
 53:          6.2         2.2          4.5         1.5 versicolor
 54:          5.0         2.3          3.3         1.0 versicolor
 55:          5.5         2.3          4.0         1.3 versicolor
 56:          6.3         2.3          4.4         1.3 versicolor
 57:          4.9         2.4          3.3         1.0 versicolor
 58:          5.5         2.4          3.8         1.1 versicolor
 59:          5.5         2.4          3.7         1.0 versicolor
 60:          5.1         2.5          3.0         1.1 versicolor
 61:          5.5         2.5          4.0         1.3 versicolor
 62:          5.6         2.5          3.9         1.1 versicolor
 63:          6.3         2.5          4.9         1.5 versicolor
 64:          5.5         2.6          4.4         1.2 versicolor
 65:          5.7         2.6          3.5         1.0 versicolor
 66:          5.8         2.6          4.0         1.2 versicolor
 67:          5.2         2.7          3.9         1.4 versicolor
 68:          5.6         2.7          4.2         1.3 versicolor
 69:          5.8         2.7          4.1         1.0 versicolor
 70:          5.8         2.7          3.9         1.2 versicolor
 71:          6.0         2.7          5.1         1.6 versicolor
 72:          5.7         2.8          4.5         1.3 versicolor
 73:          5.7         2.8          4.1         1.3 versicolor
 74:          6.1         2.8          4.0         1.3 versicolor
 75:          6.1         2.8          4.7         1.2 versicolor
 76:          6.5         2.8          4.6         1.5 versicolor
 77:          6.8         2.8          4.8         1.4 versicolor
 78:          5.6         2.9          3.6         1.3 versicolor
 79:          5.7         2.9          4.2         1.3 versicolor
 80:          6.0         2.9          4.5         1.5 versicolor
 81:          6.1         2.9          4.7         1.4 versicolor
 82:          6.2         2.9          4.3         1.3 versicolor
 83:          6.4         2.9          4.3         1.3 versicolor
 84:          6.6         2.9          4.6         1.3 versicolor
 85:          5.4         3.0          4.5         1.5 versicolor
 86:          5.6         3.0          4.5         1.5 versicolor
 87:          5.6         3.0          4.1         1.3 versicolor
 88:          5.7         3.0          4.2         1.2 versicolor
 89:          5.9         3.0          4.2         1.5 versicolor
 90:          6.1         3.0          4.6         1.4 versicolor
 91:          6.6         3.0          4.4         1.4 versicolor
 92:          6.7         3.0          5.0         1.7 versicolor
 93:          6.7         3.1          4.4         1.4 versicolor
 94:          6.7         3.1          4.7         1.5 versicolor
 95:          6.9         3.1          4.9         1.5 versicolor
 96:          5.9         3.2          4.8         1.8 versicolor
 97:          6.4         3.2          4.5         1.5 versicolor
 98:          7.0         3.2          4.7         1.4 versicolor
 99:          6.3         3.3          4.7         1.6 versicolor
100:          6.0         3.4          4.5         1.6 versicolor
     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
irisDT[Petal.Width %between% c(0.3, 0.4)] # values in an interval
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
 1:          4.5         2.3          1.3         0.3  setosa
 2:          4.8         3.0          1.4         0.3  setosa
 3:          4.6         3.4          1.4         0.3  setosa
 4:          5.0         3.4          1.6         0.4  setosa
 5:          5.4         3.4          1.5         0.4  setosa
 6:          5.0         3.5          1.3         0.3  setosa
 7:          5.1         3.5          1.4         0.3  setosa
 8:          5.1         3.7          1.5         0.4  setosa
 9:          5.1         3.8          1.5         0.3  setosa
10:          5.1         3.8          1.9         0.4  setosa
11:          5.7         3.8          1.7         0.3  setosa
12:          5.4         3.9          1.7         0.4  setosa
13:          5.4         3.9          1.3         0.4  setosa
14:          5.7         4.4          1.5         0.4  setosa

Selecting columns

DT[i, j, by]

irisDT[, Species] # returns a vector
  [1] virginica  virginica  virginica  virginica  virginica 
  [6] virginica  virginica  virginica  virginica  virginica 
 [11] virginica  virginica  virginica  virginica  virginica 
 [16] virginica  virginica  virginica  virginica  virginica 
 [21] virginica  virginica  virginica  virginica  virginica 
 [26] virginica  virginica  virginica  virginica  virginica 
 [31] virginica  virginica  virginica  virginica  virginica 
 [36] virginica  virginica  virginica  virginica  virginica 
 [41] virginica  virginica  virginica  virginica  virginica 
 [46] virginica  virginica  virginica  virginica  virginica 
 [51] versicolor versicolor versicolor versicolor versicolor
 [56] versicolor versicolor versicolor versicolor versicolor
 [61] versicolor versicolor versicolor versicolor versicolor
 [66] versicolor versicolor versicolor versicolor versicolor
 [71] versicolor versicolor versicolor versicolor versicolor
 [76] versicolor versicolor versicolor versicolor versicolor
 [81] versicolor versicolor versicolor versicolor versicolor
 [86] versicolor versicolor versicolor versicolor versicolor
 [91] versicolor versicolor versicolor versicolor versicolor
 [96] versicolor versicolor versicolor versicolor versicolor
[101] setosa     setosa     setosa     setosa     setosa    
[106] setosa     setosa     setosa     setosa     setosa    
[111] setosa     setosa     setosa     setosa     setosa    
[116] setosa     setosa     setosa     setosa     setosa    
[121] setosa     setosa     setosa     setosa     setosa    
[126] setosa     setosa     setosa     setosa     setosa    
[131] setosa     setosa     setosa     setosa     setosa    
[136] setosa     setosa     setosa     setosa     setosa    
[141] setosa     setosa     setosa     setosa     setosa    
[146] setosa     setosa     setosa     setosa     setosa    
Levels: setosa versicolor virginica
irisDT[, "Species"] # returns a dataframe/data.table
       Species
  1: virginica
  2: virginica
  3: virginica
  4: virginica
  5: virginica
 ---          
146:    setosa
147:    setosa
148:    setosa
149:    setosa
150:    setosa
irisDT[, -c("Species")]
     Sepal.Length Sepal.Width Petal.Length Petal.Width
  1:          6.0         2.2          5.0         1.5
  2:          4.9         2.5          4.5         1.7
  3:          5.7         2.5          5.0         2.0
  4:          6.3         2.5          5.0         1.9
  5:          6.7         2.5          5.8         1.8
 ---                                                  
146:          5.4         3.9          1.3         0.4
147:          5.8         4.0          1.2         0.2
148:          5.2         4.1          1.5         0.1
149:          5.5         4.2          1.4         0.2
150:          5.7         4.4          1.5         0.4
irisDT[, !c("Species")]
     Sepal.Length Sepal.Width Petal.Length Petal.Width
  1:          6.0         2.2          5.0         1.5
  2:          4.9         2.5          4.5         1.7
  3:          5.7         2.5          5.0         2.0
  4:          6.3         2.5          5.0         1.9
  5:          6.7         2.5          5.8         1.8
 ---                                                  
146:          5.4         3.9          1.3         0.4
147:          5.8         4.0          1.2         0.2
148:          5.2         4.1          1.5         0.1
149:          5.5         4.2          1.4         0.2
150:          5.7         4.4          1.5         0.4
irisDT[, list(Species, LS = Sepal.Length)] # select and rename
       Species  LS
  1: virginica 6.0
  2: virginica 4.9
  3: virginica 5.7
  4: virginica 6.3
  5: virginica 6.7
 ---              
146:    setosa 5.4
147:    setosa 5.8
148:    setosa 5.2
149:    setosa 5.5
150:    setosa 5.7
# .() is an alias for list

irisDT[, .(Species, SL = Sepal.Length)]
       Species  SL
  1: virginica 6.0
  2: virginica 4.9
  3: virginica 5.7
  4: virginica 6.3
  5: virginica 6.7
 ---              
146:    setosa 5.4
147:    setosa 5.8
148:    setosa 5.2
149:    setosa 5.5
150:    setosa 5.7
irisDT[Species == "virginica" & Sepal.Length > 7 & Sepal.Width < 3, !c("Species")]
   Sepal.Length Sepal.Width Petal.Length Petal.Width
1:          7.7         2.6          6.9         2.3
2:          7.4         2.8          6.1         1.9
3:          7.7         2.8          6.7         2.0
4:          7.3         2.9          6.3         1.8
irisDT[Species == "virginica" &
Sepal.Length > 7 &Sepal.Width < 3,.(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)]
   Sepal.Length Sepal.Width Petal.Length Petal.Width
1:          7.7         2.6          6.9         2.3
2:          7.4         2.8          6.1         1.9
3:          7.7         2.8          6.7         2.0
4:          7.3         2.9          6.3         1.8

Adding information

Making new columns - preserving existing ones

DT[i, j, by]

data.table uses a new operator := to add/update/delete columns (by reference)

  1. option 1
irisDT[, maxLength := max(Sepal.Length, Petal.Length)][,
          minWidth := min(Sepal.Length, Petal.Width)]
irisDT
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.0         2.2          5.0         1.5 virginica
  2:          4.9         2.5          4.5         1.7 virginica
  3:          5.7         2.5          5.0         2.0 virginica
  4:          6.3         2.5          5.0         1.9 virginica
  5:          6.7         2.5          5.8         1.8 virginica
 ---                                                            
146:          5.4         3.9          1.3         0.4    setosa
147:          5.8         4.0          1.2         0.2    setosa
148:          5.2         4.1          1.5         0.1    setosa
149:          5.5         4.2          1.4         0.2    setosa
150:          5.7         4.4          1.5         0.4    setosa
     maxLength minWidth
  1:       7.9      0.1
  2:       7.9      0.1
  3:       7.9      0.1
  4:       7.9      0.1
  5:       7.9      0.1
 ---                   
146:       7.9      0.1
147:       7.9      0.1
148:       7.9      0.1
149:       7.9      0.1
150:       7.9      0.1
  1. option 2

LHS := RHS form

irisDT[, c("maxLength", "minWidth") :=
         list(
           max(Sepal.Length, Petal.Length),
           min(Sepal.Width, Petal.Width)
         )]

irisDT
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.0         2.2          5.0         1.5 virginica
  2:          4.9         2.5          4.5         1.7 virginica
  3:          5.7         2.5          5.0         2.0 virginica
  4:          6.3         2.5          5.0         1.9 virginica
  5:          6.7         2.5          5.8         1.8 virginica
 ---                                                            
146:          5.4         3.9          1.3         0.4    setosa
147:          5.8         4.0          1.2         0.2    setosa
148:          5.2         4.1          1.5         0.1    setosa
149:          5.5         4.2          1.4         0.2    setosa
150:          5.7         4.4          1.5         0.4    setosa
     maxLength minWidth
  1:       7.9      0.1
  2:       7.9      0.1
  3:       7.9      0.1
  4:       7.9      0.1
  5:       7.9      0.1
 ---                   
146:       7.9      0.1
147:       7.9      0.1
148:       7.9      0.1
149:       7.9      0.1
150:       7.9      0.1
  1. Option 3

Functional form

irisDT[, `:=` (
  maxLength = max(Sepal.Length, Petal.Length),
  minWidth = min(Sepal.Width, Petal.Width)
)]
irisDT
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
  1:          6.0         2.2          5.0         1.5 virginica
  2:          4.9         2.5          4.5         1.7 virginica
  3:          5.7         2.5          5.0         2.0 virginica
  4:          6.3         2.5          5.0         1.9 virginica
  5:          6.7         2.5          5.8         1.8 virginica
 ---                                                            
146:          5.4         3.9          1.3         0.4    setosa
147:          5.8         4.0          1.2         0.2    setosa
148:          5.2         4.1          1.5         0.1    setosa
149:          5.5         4.2          1.4         0.2    setosa
150:          5.7         4.4          1.5         0.4    setosa
     maxLength minWidth
  1:       7.9      0.1
  2:       7.9      0.1
  3:       7.9      0.1
  4:       7.9      0.1
  5:       7.9      0.1
 ---                   
146:       7.9      0.1
147:       7.9      0.1
148:       7.9      0.1
149:       7.9      0.1
150:       7.9      0.1

Making new columns - dropping existing ones

DT[i, j, by]

irisDT[, .(maxLength = pmax(Sepal.Length, Petal.Length),
minWidth = pmin(Sepal.Width, Petal.Width))]
     maxLength minWidth
  1:       6.0      1.5
  2:       4.9      1.7
  3:       5.7      2.0
  4:       6.3      1.9
  5:       6.7      1.8
 ---                   
146:       5.4      0.4
147:       5.8      0.2
148:       5.2      0.1
149:       5.5      0.2
150:       5.7      0.4
newIris <- irisDT[, .(Sepal.Area = Sepal.Width * Sepal.Length,
Petal.Area = Petal.Width * Petal.Length)]
newIris[, Area.Ratio := Petal.Area / Sepal.Area]
newIris
     Sepal.Area Petal.Area  Area.Ratio
  1:      13.20       7.50 0.568181818
  2:      12.25       7.65 0.624489796
  3:      14.25      10.00 0.701754386
  4:      15.75       9.50 0.603174603
  5:      16.75      10.44 0.623283582
 ---                                  
146:      21.06       0.52 0.024691358
147:      23.20       0.24 0.010344828
148:      21.32       0.15 0.007035647
149:      23.10       0.28 0.012121212
150:      25.08       0.60 0.023923445

Reducing information

Summarizing rows

irisDT[, .(meanSepalLength = mean(Sepal.Length),
           meanSepalWidth = mean(Sepal.Width),
           meanPetalLength = mean(Petal.Length),
           meanPetalWidth = mean(Petal.Width))]
   meanSepalLength meanSepalWidth meanPetalLength meanPetalWidth
1:        5.843333       3.057333           3.758       1.199333
irisDT[, .(nSamples = .N, nSpecies = uniqueN(Species))
       ]
   nSamples nSpecies
1:      150        3

Summarizing rows with filtering

irisDT[Species == "versicolor",.(
  meanSepalLength = mean(Sepal.Length),
           meanSepalWidth = mean(Sepal.Width),
           meanPetalLength = mean(Petal.Length),
           meanPetalWidth = mean(Petal.Width)
)]
   meanSepalLength meanSepalWidth meanPetalLength meanPetalWidth
1:           5.936           2.77            4.26          1.326

Grouping by one or more variables

irisDT[, .(meanSepalLength = mean(Sepal.Length),
meanSepalWidth = mean(Sepal.Width),
meanPetalLength = mean(Petal.Length),
meanPetalWidth = mean(Petal.Width)), by = "Species"]
      Species meanSepalLength meanSepalWidth meanPetalLength
1:  virginica           6.588          2.974           5.552
2: versicolor           5.936          2.770           4.260
3:     setosa           5.006          3.428           1.462
   meanPetalWidth
1:          2.026
2:          1.326
3:          0.246
irisDT[, .(nSamples = .N, nSpecies = uniqueN(Species)), by = .(Species)]
      Species nSamples nSpecies
1:  virginica       50        1
2: versicolor       50        1
3:     setosa       50        1

Combining information

joining 2 data.tables

Inner join: return all rows from x where there are matching values in y and all columns from x and y. In case of multiple matches between x and y, all combinations of the matches are returned.

Full join: return all rows and columns from both x and y.

Left join: return all rows from x and all columns from x and y. In case of multiple matches between x and y, all combinations of the matches are returned.

Right join: return all rows from y and all columns from x and y. In case of multiple matches between x and y, all combinations of the matches are returned.

DT[i, on]

(x <- data.table(id = c(1, 2, 4, 5, 6),
x = c(9, 12, 14, 21, 8)))
   id  x
1:  1  9
2:  2 12
3:  4 14
4:  5 21
5:  6  8
(y <- data.table(id = c(1, 3, 4, 6, 6),
y = c(8, 14, 19, 2, 4)))
   id  y
1:  1  8
2:  3 14
3:  4 19
4:  6  2
5:  6  4

inner join - data.table

y[x, on = .(id), nomatch = 0]
   id  y  x
1:  1  8  9
2:  4 19 14
3:  6  2  8
4:  6  4  8

full join - merge

merge.data.table(x = x, y = y, by = "id", all = TRUE)
   id  x  y
1:  1  9  8
2:  2 12 NA
3:  3 NA 14
4:  4 14 19
5:  5 21 NA
6:  6  8  2
7:  6  8  4

left join - merge

merge.data.table(x = x, y = y, by = "id", all.x = TRUE)
   id  x  y
1:  1  9  8
2:  2 12 NA
3:  4 14 19
4:  5 21 NA
5:  6  8  2
6:  6  8  4

left join - data.table

y[x, on = .(id)]
   id  y  x
1:  1  8  9
2:  2 NA 12
3:  4 19 14
4:  5 NA 21
5:  6  2  8
6:  6  4  8

right join - merge

data.table::merge.data.table(x = x, y = y, by = "id", all.y = T)
   id  x  y
1:  1  9  8
2:  3 NA 14
3:  4 14 19
4:  6  8  2
5:  6  8  4

right jpin - data.table

x[y, on = .(id)]
   id  x  y
1:  1  9  8
2:  3 NA 14
3:  4 14 19
4:  6  8  2
5:  6  8  4

antijoin - data.table

x[!y, on = .(id)]
   id  x
1:  2 12
2:  5 21

Keys

No need of the on argument when performing a join

setkey(x, id)
setkey(y, "id")
haskey(x); haskey(y)
[1] TRUE
[1] TRUE
key(x); key(y)
[1] "id"
[1] "id"

joins with keys already specified

# Inner join x and y
x[y, nomatch = 0]
   id  x  y
1:  1  9  8
2:  4 14 19
3:  6  8  2
4:  6  8  4
# Right join x and y
x[y]
   id  x  y
1:  1  9  8
2:  3 NA 14
3:  4 14 19
4:  6  8  2
5:  6  8  4
# Left join x and y
y[x]
   id  y  x
1:  1  8  9
2:  2 NA 12
3:  4 19 14
4:  5 NA 21
5:  6  2  8
6:  6  4  8

Exercise 3

library(dplyr)
library(data.table)
data(starwars)
starwarsDT <- as.data.table(starwars)
starwarsDT[1]
             name height mass hair_color skin_color eye_color
1: Luke Skywalker    172   77      blond       fair      blue
   birth_year  sex    gender homeworld species
1:         19 male masculine  Tatooine   Human
                                                                                         films
1: The Empire Strikes Back,Revenge of the Sith,Return of the Jedi,A New Hope,The Force Awakens
                            vehicles               starships
1: Snowspeeder,Imperial Speeder Bike X-wing,Imperial shuttle
starwarsDT[, .(min = min(mass, na.rm = T),
               max = max(mass, na.rm = T),
               nobs = .N), by = homeworld][nobs >1] 
    homeworld   min   max nobs
 1:  Tatooine  32.0 136.0   10
 2:     Naboo  32.0  85.0   11
 3:  Alderaan  49.0  79.0    3
 4:  Kashyyyk 112.0 136.0    2
 5:  Corellia  77.0  80.0    2
 6:      <NA>  17.0 140.0   10
 7:    Kamino  78.2  88.0    3
 8: Coruscant  50.0  50.0    3
 9:    Ryloth  55.0  55.0    2
10:    Mirial  50.0  56.2    2

Reshaping data with data.table

Wide format

knitr::kable(head(iris[1:4,], n =3))
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
5.1 3.5 1.4 0.2 setosa
4.9 3.0 1.4 0.2 setosa
4.7 3.2 1.3 0.2 setosa

melt(wide_df, id.vars = 'grouping variable', variable.name = 'measures_across_grouping_var',value.name = 'value_name_as_desired')

Long to wide

dcast(long_df, group ~ atttribute, value.var = "value")

Separating and uniting information

irisUnited <- irisDT[ , Sepal.Length.Width := paste0(Sepal.Length, "/", Sepal.Width)]
irisUnited[1:5]
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:          6.0         2.2          5.0         1.5 virginica
2:          4.9         2.5          4.5         1.7 virginica
3:          5.7         2.5          5.0         2.0 virginica
4:          6.3         2.5          5.0         1.9 virginica
5:          6.7         2.5          5.8         1.8 virginica
   maxLength minWidth Sepal.Length.Width
1:       7.9      0.1              6/2.2
2:       7.9      0.1            4.9/2.5
3:       7.9      0.1            5.7/2.5
4:       7.9      0.1            6.3/2.5
5:       7.9      0.1            6.7/2.5
irisSeparated = irisUnited[, c('Sepal.Length', 'Sepal.Width') := tstrsplit(Sepal.Length.Width, "/")]
irisSeparated[1:5]
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:            6         2.2          5.0         1.5 virginica
2:          4.9         2.5          4.5         1.7 virginica
3:          5.7         2.5          5.0         2.0 virginica
4:          6.3         2.5          5.0         1.9 virginica
5:          6.7         2.5          5.8         1.8 virginica
   maxLength minWidth Sepal.Length.Width
1:       7.9      0.1              6/2.2
2:       7.9      0.1            4.9/2.5
3:       7.9      0.1            5.7/2.5
4:       7.9      0.1            6.3/2.5
5:       7.9      0.1            6.7/2.5

Hands-on session

Exercise 1

data("relig_income")
a = as.data.table(relig_income)
a[1:10]
                   religion <$10k $10-20k $20-30k $30-40k $40-50k
 1:                Agnostic    27      34      60      81      76
 2:                 Atheist    12      27      37      52      35
 3:                Buddhist    27      21      30      34      33
 4:                Catholic   418     617     732     670     638
 5:      Don’t know/refused    15      14      15      11      10
 6:        Evangelical Prot   575     869    1064     982     881
 7:                   Hindu     1       9       7       9      11
 8: Historically Black Prot   228     244     236     238     197
 9:       Jehovah's Witness    20      27      24      24      21
10:                  Jewish    19      19      25      25      30
    $50-75k $75-100k $100-150k >150k Don't know/refused
 1:     137      122       109    84                 96
 2:      70       73        59    74                 76
 3:      58       62        39    53                 54
 4:    1116      949       792   633               1489
 5:      35       21        17    18                116
 6:    1486      949       723   414               1529
 7:      34       47        48    54                 37
 8:     223      131        81    78                339
 9:      30       15        11     6                 37
10:      95       69        87   151                162
b = melt(a, id.vars = "religion", variable.name = "income_category", value.name = "n")

Exercise 2

data("storms")
aa = as.data.table(storms)
aa[, date:= paste0(year, "-", month,"-", day)]

Useful symbols/ functions

Summarizing for specific condition

cols = sapply(irisDT, is.numeric)
cols = names(cols)[cols]
irisDT[, lapply(.SD, mean), by = Species, .SDcols = cols]
      Species Petal.Length Petal.Width maxLength minWidth
1:  virginica        5.552       2.026       7.9      0.1
2: versicolor        4.260       1.326       7.9      0.1
3:     setosa        1.462       0.246       7.9      0.1

Summarizing for specific variable of interest

irisDT[, lapply(.SD, mean), by = Species, .SDcols = c("Petal.Length", "Petal.Width")]
      Species Petal.Length Petal.Width
1:  virginica        5.552       2.026
2: versicolor        4.260       1.326
3:     setosa        1.462       0.246

Summarizing all columns

irisDT[ , unlist(recursive = FALSE, lapply(.(mean = mean, sd = sd, min = min, max = max), 
                                           function(f) lapply(.SD, f))), by = Species]
      Species mean.Sepal.Length mean.Sepal.Width mean.Petal.Length
1:  virginica                NA               NA             5.552
2: versicolor                NA               NA             4.260
3:     setosa                NA               NA             1.462
   mean.Petal.Width mean.maxLength mean.minWidth
1:            2.026            7.9           0.1
2:            1.326            7.9           0.1
3:            0.246            7.9           0.1
   mean.Sepal.Length.Width sd.Sepal.Length sd.Sepal.Width
1:                      NA       0.6358796      0.3224966
2:                      NA       0.5161711      0.3137983
3:                      NA       0.3524897      0.3790644
   sd.Petal.Length sd.Petal.Width sd.maxLength sd.minWidth
1:       0.5518947      0.2746501            0           0
2:       0.4699110      0.1977527            0           0
3:       0.1736640      0.1053856            0           0
   sd.Sepal.Length.Width min.Sepal.Length min.Sepal.Width
1:                    NA              4.9             2.2
2:                    NA              4.9               2
3:                    NA              4.3             2.3
   min.Petal.Length min.Petal.Width min.maxLength min.minWidth
1:              4.5             1.4           7.9          0.1
2:              3.0             1.0           7.9          0.1
3:              1.0             0.1           7.9          0.1
   min.Sepal.Length.Width max.Sepal.Length max.Sepal.Width
1:                4.9/2.5              7.9             3.8
2:                4.9/2.4                7             3.4
3:                  4.3/3              5.8             4.4
   max.Petal.Length max.Petal.Width max.maxLength max.minWidth
1:              6.9             2.5           7.9          0.1
2:              5.1             1.8           7.9          0.1
3:              1.9             0.6           7.9          0.1
   max.Sepal.Length.Width
1:                7.9/3.8
2:                  7/3.2
3:                  5/3.6

Hands-on 3

Exercise 1 - Compute the minimum, maximum, median and mean for all numeric columns

data("starwars")
starwarsDT = as.data.table(starwars)
cols = sapply(starwarsDT, is.numeric)
cols = names(cols)[cols]

starwarsDT[, height:= as.numeric(height)]

starwarsDT[, unlist(recursive = F,
                    lapply(.(min = min, max = max, median = median, mean = mean), function(f) lapply(.SD, f))), by = .(homeworld, species), .SDcols = cols]
         homeworld        species min.height min.mass min.birth_year
 1:       Tatooine          Human        163       NA           19.0
 2:       Tatooine          Droid         97     32.0             NA
 3:          Naboo          Droid         96     32.0           33.0
 4:       Alderaan          Human        150       NA             NA
 5:        Stewjon          Human        182     77.0           57.0
 6:         Eriadu          Human        180       NA           64.0
 7:       Kashyyyk        Wookiee        228    112.0             NA
 8:       Corellia          Human        170     77.0           21.0
 9:          Rodia         Rodian        173     74.0           44.0
10:      Nal Hutta           Hutt        175   1358.0          600.0
11:     Bestine IV          Human        180    110.0             NA
12:           <NA> Yoda's species         66     17.0          896.0
13:          Naboo          Human        157       NA             NA
14:         Kamino          Human        183     78.2           31.5
15:           <NA>          Droid         NA       NA             NA
16:      Trandosha     Trandoshan        190    113.0           53.0
17:        Socorro          Human        177     79.0           31.0
18:         Bespin          Human        175     79.0           37.0
19:       Mon Cala   Mon Calamari        180     83.0           41.0
20:      Chandrila          Human        150       NA           48.0
21:           <NA>          Human         NA       NA             NA
22:          Endor           Ewok         88     20.0            8.0
23:        Sullust      Sullustan        160     68.0             NA
24: Cato Neimoidia      Neimodian        191     90.0             NA
25:      Coruscant          Human        167       NA             NA
26:          Naboo         Gungan        196       NA             NA
27:          Naboo           <NA>        183       NA             NA
28:       Toydaria      Toydarian        137       NA             NA
29:      Malastare            Dug        112     40.0             NA
30:       Dathomir         Zabrak        175     80.0           54.0
31:         Ryloth        Twi'lek        178       NA             NA
32:        Vulpter     Vulptereen         94     45.0             NA
33:        Troiken          Xexto        122       NA             NA
34:           Tund          Toong        163     65.0             NA
35:     Haruun Kal          Human        188     84.0           72.0
36:          Cerea         Cerean        198     82.0           92.0
37:    Glee Anselm       Nautolan        196     87.0             NA
38:       Iridonia         Zabrak        171       NA             NA
39:      Coruscant     Tholothian        184     50.0             NA
40:        Iktotch       Iktotchi        188       NA             NA
41:        Quermia       Quermian        264       NA             NA
42:          Dorin        Kel Dor        188     80.0           22.0
43:       Champala       Chagrian        196       NA             NA
44:       Geonosis      Geonosian        183     80.0             NA
45:         Mirial       Mirialan        166     50.0           40.0
46:        Serenno          Human        193     80.0          102.0
47:   Concord Dawn          Human        183     79.0           66.0
48:          Zolan       Clawdite        168     55.0             NA
49:           Ojom       Besalisk        198    102.0             NA
50:         Kamino       Kaminoan        213       NA             NA
51:    Aleen Minor         Aleena         79     15.0             NA
52:          Skako        Skakoan        193     48.0             NA
53:     Muunilinst           Muun        191       NA             NA
54:          Shili        Togruta        178     57.0             NA
55:          Kalee        Kaleesh        216    159.0             NA
56:         Umbara           <NA>        178     48.0             NA
57:         Utapau         Pau'an        206     80.0             NA
58:           <NA>           <NA>         NA       NA             NA
         homeworld        species min.height min.mass min.birth_year
    max.height max.mass max.birth_year median.height median.mass
 1:        202       NA           82.0         180.5          NA
 2:        167     75.0             NA         132.0        53.5
 3:         96     32.0           33.0          96.0        32.0
 4:        191       NA             NA         188.0          NA
 5:        182     77.0           57.0         182.0        77.0
 6:        180       NA           64.0         180.0          NA
 7:        234    136.0             NA         231.0       124.0
 8:        180     80.0           29.0         175.0        78.5
 9:        173     74.0           44.0         173.0        74.0
10:        175   1358.0          600.0         175.0      1358.0
11:        180    110.0             NA         180.0       110.0
12:         66     17.0          896.0          66.0        17.0
13:        185       NA             NA         165.0          NA
14:        183     78.2           31.5         183.0        78.2
15:         NA       NA             NA            NA          NA
16:        190    113.0           53.0         190.0       113.0
17:        177     79.0           31.0         177.0        79.0
18:        175     79.0           37.0         175.0        79.0
19:        180     83.0           41.0         180.0        83.0
20:        150       NA           48.0         150.0          NA
21:         NA       NA             NA            NA          NA
22:         88     20.0            8.0          88.0        20.0
23:        160     68.0             NA         160.0        68.0
24:        191     90.0             NA         191.0        90.0
25:        170       NA             NA         168.5          NA
26:        224       NA             NA         206.0          NA
27:        183       NA             NA         183.0          NA
28:        137       NA             NA         137.0          NA
29:        112     40.0             NA         112.0        40.0
30:        175     80.0           54.0         175.0        80.0
31:        180       NA             NA         179.0          NA
32:         94     45.0             NA          94.0        45.0
33:        122       NA             NA         122.0          NA
34:        163     65.0             NA         163.0        65.0
35:        188     84.0           72.0         188.0        84.0
36:        198     82.0           92.0         198.0        82.0
37:        196     87.0             NA         196.0        87.0
38:        171       NA             NA         171.0          NA
39:        184     50.0             NA         184.0        50.0
40:        188       NA             NA         188.0          NA
41:        264       NA             NA         264.0          NA
42:        188     80.0           22.0         188.0        80.0
43:        196       NA             NA         196.0          NA
44:        183     80.0             NA         183.0        80.0
45:        170     56.2           58.0         168.0        53.1
46:        193     80.0          102.0         193.0        80.0
47:        183     79.0           66.0         183.0        79.0
48:        168     55.0             NA         168.0        55.0
49:        198    102.0             NA         198.0       102.0
50:        229       NA             NA         221.0          NA
51:         79     15.0             NA          79.0        15.0
52:        193     48.0             NA         193.0        48.0
53:        191       NA             NA         191.0          NA
54:        178     57.0             NA         178.0        57.0
55:        216    159.0             NA         216.0       159.0
56:        178     48.0             NA         178.0        48.0
57:        206     80.0             NA         206.0        80.0
58:         NA       NA             NA            NA          NA
    max.height max.mass max.birth_year median.height median.mass
    median.birth_year mean.height mean.mass mean.birth_year
 1:             44.45    179.2500        NA          47.475
 2:                NA    132.0000      53.5              NA
 3:             33.00     96.0000      32.0          33.000
 4:                NA    176.3333        NA              NA
 5:             57.00    182.0000      77.0          57.000
 6:             64.00    180.0000        NA          64.000
 7:                NA    231.0000     124.0              NA
 8:             25.00    175.0000      78.5          25.000
 9:             44.00    173.0000      74.0          44.000
10:            600.00    175.0000    1358.0         600.000
11:                NA    180.0000     110.0              NA
12:            896.00     66.0000      17.0         896.000
13:                NA    168.4000        NA              NA
14:             31.50    183.0000      78.2          31.500
15:                NA          NA        NA              NA
16:             53.00    190.0000     113.0          53.000
17:             31.00    177.0000      79.0          31.000
18:             37.00    175.0000      79.0          37.000
19:             41.00    180.0000      83.0          41.000
20:             48.00    150.0000        NA          48.000
21:                NA          NA        NA              NA
22:              8.00     88.0000      20.0           8.000
23:                NA    160.0000      68.0              NA
24:                NA    191.0000      90.0              NA
25:                NA    168.5000        NA              NA
26:                NA    208.6667        NA              NA
27:                NA    183.0000        NA              NA
28:                NA    137.0000        NA              NA
29:                NA    112.0000      40.0              NA
30:             54.00    175.0000      80.0          54.000
31:                NA    179.0000        NA              NA
32:                NA     94.0000      45.0              NA
33:                NA    122.0000        NA              NA
34:                NA    163.0000      65.0              NA
35:             72.00    188.0000      84.0          72.000
36:             92.00    198.0000      82.0          92.000
37:                NA    196.0000      87.0              NA
38:                NA    171.0000        NA              NA
39:                NA    184.0000      50.0              NA
40:                NA    188.0000        NA              NA
41:                NA    264.0000        NA              NA
42:             22.00    188.0000      80.0          22.000
43:                NA    196.0000        NA              NA
44:                NA    183.0000      80.0              NA
45:             49.00    168.0000      53.1          49.000
46:            102.00    193.0000      80.0         102.000
47:             66.00    183.0000      79.0          66.000
48:                NA    168.0000      55.0              NA
49:                NA    198.0000     102.0              NA
50:                NA    221.0000        NA              NA
51:                NA     79.0000      15.0              NA
52:                NA    193.0000      48.0              NA
53:                NA    191.0000        NA              NA
54:                NA    178.0000      57.0              NA
55:                NA    216.0000     159.0              NA
56:                NA    178.0000      48.0              NA
57:                NA    206.0000      80.0              NA
58:                NA          NA        NA              NA
    median.birth_year mean.height mean.mass mean.birth_year

Exercise 2 - all colnames to upper case

starwarsDT = as.data.table(starwars)
setnames(starwarsDT, toupper(names(starwarsDT)))

Convert all character columns to upper case

cols = sapply(starwarsDT, is.character)
cols = names(cols)[cols]
starwarsDT[, lapply(.SD, toupper)]
                     NAME HEIGHT MASS    HAIR_COLOR
 1:        LUKE SKYWALKER    172   77         BLOND
 2:                 C-3PO    167   75          <NA>
 3:                 R2-D2     96   32          <NA>
 4:           DARTH VADER    202  136          NONE
 5:           LEIA ORGANA    150   49         BROWN
 6:             OWEN LARS    178  120   BROWN, GREY
 7:    BERU WHITESUN LARS    165   75         BROWN
 8:                 R5-D4     97   32          <NA>
 9:     BIGGS DARKLIGHTER    183   84         BLACK
10:        OBI-WAN KENOBI    182   77 AUBURN, WHITE
11:      ANAKIN SKYWALKER    188   84         BLOND
12:        WILHUFF TARKIN    180 <NA>  AUBURN, GREY
13:             CHEWBACCA    228  112         BROWN
14:              HAN SOLO    180   80         BROWN
15:                GREEDO    173   74          <NA>
16: JABBA DESILIJIC TIURE    175 1358          <NA>
17:        WEDGE ANTILLES    170   77         BROWN
18:      JEK TONO PORKINS    180  110         BROWN
19:                  YODA     66   17         WHITE
20:             PALPATINE    170   75          GREY
21:             BOBA FETT    183 78.2         BLACK
22:                 IG-88    200  140          NONE
23:                 BOSSK    190  113          NONE
24:      LANDO CALRISSIAN    177   79         BLACK
25:                 LOBOT    175   79          NONE
26:                ACKBAR    180   83          NONE
27:            MON MOTHMA    150 <NA>        AUBURN
28:          ARVEL CRYNYD   <NA> <NA>         BROWN
29: WICKET SYSTRI WARRICK     88   20         BROWN
30:             NIEN NUNB    160   68          NONE
31:          QUI-GON JINN    193   89         BROWN
32:           NUTE GUNRAY    191   90          NONE
33:         FINIS VALORUM    170 <NA>         BLOND
34:         JAR JAR BINKS    196   66          NONE
35:          ROOS TARPALS    224   82          NONE
36:            RUGOR NASS    206 <NA>          NONE
37:              RIC OLIÉ    183 <NA>         BROWN
38:                 WATTO    137 <NA>         BLACK
39:               SEBULBA    112   40          NONE
40:         QUARSH PANAKA    183 <NA>         BLACK
41:        SHMI SKYWALKER    163 <NA>         BLACK
42:            DARTH MAUL    175   80          NONE
43:           BIB FORTUNA    180 <NA>          NONE
44:           AYLA SECURA    178   55          NONE
45:              DUD BOLT     94   45          NONE
46:               GASGANO    122 <NA>          NONE
47:        BEN QUADINAROS    163   65          NONE
48:            MACE WINDU    188   84          NONE
49:          KI-ADI-MUNDI    198   82         WHITE
50:             KIT FISTO    196   87          NONE
51:             EETH KOTH    171 <NA>         BLACK
52:            ADI GALLIA    184   50          NONE
53:           SAESEE TIIN    188 <NA>          NONE
54:           YARAEL POOF    264 <NA>          NONE
55:              PLO KOON    188   80          NONE
56:            MAS AMEDDA    196 <NA>          NONE
57:          GREGAR TYPHO    185   85         BLACK
58:                 CORDÉ    157 <NA>         BROWN
59:           CLIEGG LARS    183 <NA>         BROWN
60:     POGGLE THE LESSER    183   80          NONE
61:       LUMINARA UNDULI    170 56.2         BLACK
62:         BARRISS OFFEE    166   50         BLACK
63:                 DORMÉ    165 <NA>         BROWN
64:                 DOOKU    193   80         WHITE
65:   BAIL PRESTOR ORGANA    191 <NA>         BLACK
66:            JANGO FETT    183   79         BLACK
67:            ZAM WESELL    168   55        BLONDE
68:       DEXTER JETTSTER    198  102          NONE
69:               LAMA SU    229   88          NONE
70:               TAUN WE    213 <NA>          NONE
71:            JOCASTA NU    167 <NA>         WHITE
72:         RATTS TYERELL     79   15          NONE
73:                R4-P17     96 <NA>          NONE
74:            WAT TAMBOR    193   48          NONE
75:              SAN HILL    191 <NA>          NONE
76:              SHAAK TI    178   57          NONE
77:              GRIEVOUS    216  159          NONE
78:               TARFFUL    234  136         BROWN
79:       RAYMUS ANTILLES    188   79         BROWN
80:             SLY MOORE    178   48          NONE
81:            TION MEDON    206   80          NONE
82:                  FINN   <NA> <NA>         BLACK
83:                   REY   <NA> <NA>         BROWN
84:           POE DAMERON   <NA> <NA>         BROWN
85:                   BB8   <NA> <NA>          NONE
86:        CAPTAIN PHASMA   <NA> <NA>       UNKNOWN
87:         PADMÉ AMIDALA    165   45         BROWN
                     NAME HEIGHT MASS    HAIR_COLOR
             SKIN_COLOR     EYE_COLOR BIRTH_YEAR            SEX
 1:                FAIR          BLUE         19           MALE
 2:                GOLD        YELLOW        112           NONE
 3:         WHITE, BLUE           RED         33           NONE
 4:               WHITE        YELLOW       41.9           MALE
 5:               LIGHT         BROWN         19         FEMALE
 6:               LIGHT          BLUE         52           MALE
 7:               LIGHT          BLUE         47         FEMALE
 8:          WHITE, RED           RED       <NA>           NONE
 9:               LIGHT         BROWN         24           MALE
10:                FAIR     BLUE-GRAY         57           MALE
11:                FAIR          BLUE       41.9           MALE
12:                FAIR          BLUE         64           MALE
13:             UNKNOWN          BLUE        200           MALE
14:                FAIR         BROWN         29           MALE
15:               GREEN         BLACK         44           MALE
16:    GREEN-TAN, BROWN        ORANGE        600 HERMAPHRODITIC
17:                FAIR         HAZEL         21           MALE
18:                FAIR          BLUE       <NA>           MALE
19:               GREEN         BROWN        896           MALE
20:                PALE        YELLOW         82           MALE
21:                FAIR         BROWN       31.5           MALE
22:               METAL           RED         15           NONE
23:               GREEN           RED         53           MALE
24:                DARK         BROWN         31           MALE
25:               LIGHT          BLUE         37           MALE
26:        BROWN MOTTLE        ORANGE         41           MALE
27:                FAIR          BLUE         48         FEMALE
28:                FAIR         BROWN       <NA>           MALE
29:               BROWN         BROWN          8           MALE
30:                GREY         BLACK       <NA>           MALE
31:                FAIR          BLUE         92           MALE
32:       MOTTLED GREEN           RED       <NA>           MALE
33:                FAIR          BLUE         91           MALE
34:              ORANGE        ORANGE         52           MALE
35:                GREY        ORANGE       <NA>           MALE
36:               GREEN        ORANGE       <NA>           MALE
37:                FAIR          BLUE       <NA>           <NA>
38:          BLUE, GREY        YELLOW       <NA>           MALE
39:           GREY, RED        ORANGE       <NA>           MALE
40:                DARK         BROWN         62           <NA>
41:                FAIR         BROWN         72         FEMALE
42:                 RED        YELLOW         54           MALE
43:                PALE          PINK       <NA>           MALE
44:                BLUE         HAZEL         48         FEMALE
45:          BLUE, GREY        YELLOW       <NA>           MALE
46:         WHITE, BLUE         BLACK       <NA>           MALE
47: GREY, GREEN, YELLOW        ORANGE       <NA>           MALE
48:                DARK         BROWN         72           MALE
49:                PALE        YELLOW         92           MALE
50:               GREEN         BLACK       <NA>           MALE
51:               BROWN         BROWN       <NA>           MALE
52:                DARK          BLUE       <NA>         FEMALE
53:                PALE        ORANGE       <NA>           MALE
54:               WHITE        YELLOW       <NA>           MALE
55:              ORANGE         BLACK         22           MALE
56:                BLUE          BLUE       <NA>           MALE
57:                DARK         BROWN       <NA>           MALE
58:               LIGHT         BROWN       <NA>         FEMALE
59:                FAIR          BLUE         82           MALE
60:               GREEN        YELLOW       <NA>           MALE
61:              YELLOW          BLUE         58         FEMALE
62:              YELLOW          BLUE         40         FEMALE
63:               LIGHT         BROWN       <NA>         FEMALE
64:                FAIR         BROWN        102           MALE
65:                 TAN         BROWN         67           MALE
66:                 TAN         BROWN         66           MALE
67: FAIR, GREEN, YELLOW        YELLOW       <NA>         FEMALE
68:               BROWN        YELLOW       <NA>           MALE
69:                GREY         BLACK       <NA>           MALE
70:                GREY         BLACK       <NA>         FEMALE
71:                FAIR          BLUE       <NA>         FEMALE
72:          GREY, BLUE       UNKNOWN       <NA>           MALE
73:         SILVER, RED     RED, BLUE       <NA>           NONE
74:         GREEN, GREY       UNKNOWN       <NA>           MALE
75:                GREY          GOLD       <NA>           MALE
76:    RED, BLUE, WHITE         BLACK       <NA>         FEMALE
77:        BROWN, WHITE GREEN, YELLOW       <NA>           MALE
78:               BROWN          BLUE       <NA>           MALE
79:               LIGHT         BROWN       <NA>           MALE
80:                PALE         WHITE       <NA>           <NA>
81:                GREY         BLACK       <NA>           MALE
82:                DARK          DARK       <NA>           MALE
83:               LIGHT         HAZEL       <NA>         FEMALE
84:               LIGHT         BROWN       <NA>           MALE
85:                NONE         BLACK       <NA>           NONE
86:             UNKNOWN       UNKNOWN       <NA>           <NA>
87:               LIGHT         BROWN         46         FEMALE
             SKIN_COLOR     EYE_COLOR BIRTH_YEAR            SEX
       GENDER      HOMEWORLD        SPECIES
 1: MASCULINE       TATOOINE          HUMAN
 2: MASCULINE       TATOOINE          DROID
 3: MASCULINE          NABOO          DROID
 4: MASCULINE       TATOOINE          HUMAN
 5:  FEMININE       ALDERAAN          HUMAN
 6: MASCULINE       TATOOINE          HUMAN
 7:  FEMININE       TATOOINE          HUMAN
 8: MASCULINE       TATOOINE          DROID
 9: MASCULINE       TATOOINE          HUMAN
10: MASCULINE        STEWJON          HUMAN
11: MASCULINE       TATOOINE          HUMAN
12: MASCULINE         ERIADU          HUMAN
13: MASCULINE       KASHYYYK        WOOKIEE
14: MASCULINE       CORELLIA          HUMAN
15: MASCULINE          RODIA         RODIAN
16: MASCULINE      NAL HUTTA           HUTT
17: MASCULINE       CORELLIA          HUMAN
18: MASCULINE     BESTINE IV          HUMAN
19: MASCULINE           <NA> YODA'S SPECIES
20: MASCULINE          NABOO          HUMAN
21: MASCULINE         KAMINO          HUMAN
22: MASCULINE           <NA>          DROID
23: MASCULINE      TRANDOSHA     TRANDOSHAN
24: MASCULINE        SOCORRO          HUMAN
25: MASCULINE         BESPIN          HUMAN
26: MASCULINE       MON CALA   MON CALAMARI
27:  FEMININE      CHANDRILA          HUMAN
28: MASCULINE           <NA>          HUMAN
29: MASCULINE          ENDOR           EWOK
30: MASCULINE        SULLUST      SULLUSTAN
31: MASCULINE           <NA>          HUMAN
32: MASCULINE CATO NEIMOIDIA      NEIMODIAN
33: MASCULINE      CORUSCANT          HUMAN
34: MASCULINE          NABOO         GUNGAN
35: MASCULINE          NABOO         GUNGAN
36: MASCULINE          NABOO         GUNGAN
37:      <NA>          NABOO           <NA>
38: MASCULINE       TOYDARIA      TOYDARIAN
39: MASCULINE      MALASTARE            DUG
40:      <NA>          NABOO           <NA>
41:  FEMININE       TATOOINE          HUMAN
42: MASCULINE       DATHOMIR         ZABRAK
43: MASCULINE         RYLOTH        TWI'LEK
44:  FEMININE         RYLOTH        TWI'LEK
45: MASCULINE        VULPTER     VULPTEREEN
46: MASCULINE        TROIKEN          XEXTO
47: MASCULINE           TUND          TOONG
48: MASCULINE     HARUUN KAL          HUMAN
49: MASCULINE          CEREA         CEREAN
50: MASCULINE    GLEE ANSELM       NAUTOLAN
51: MASCULINE       IRIDONIA         ZABRAK
52:  FEMININE      CORUSCANT     THOLOTHIAN
53: MASCULINE        IKTOTCH       IKTOTCHI
54: MASCULINE        QUERMIA       QUERMIAN
55: MASCULINE          DORIN        KEL DOR
56: MASCULINE       CHAMPALA       CHAGRIAN
57: MASCULINE          NABOO          HUMAN
58:  FEMININE          NABOO          HUMAN
59: MASCULINE       TATOOINE          HUMAN
60: MASCULINE       GEONOSIS      GEONOSIAN
61:  FEMININE         MIRIAL       MIRIALAN
62:  FEMININE         MIRIAL       MIRIALAN
63:  FEMININE          NABOO          HUMAN
64: MASCULINE        SERENNO          HUMAN
65: MASCULINE       ALDERAAN          HUMAN
66: MASCULINE   CONCORD DAWN          HUMAN
67:  FEMININE          ZOLAN       CLAWDITE
68: MASCULINE           OJOM       BESALISK
69: MASCULINE         KAMINO       KAMINOAN
70:  FEMININE         KAMINO       KAMINOAN
71:  FEMININE      CORUSCANT          HUMAN
72: MASCULINE    ALEEN MINOR         ALEENA
73:  FEMININE           <NA>          DROID
74: MASCULINE          SKAKO        SKAKOAN
75: MASCULINE     MUUNILINST           MUUN
76:  FEMININE          SHILI        TOGRUTA
77: MASCULINE          KALEE        KALEESH
78: MASCULINE       KASHYYYK        WOOKIEE
79: MASCULINE       ALDERAAN          HUMAN
80:      <NA>         UMBARA           <NA>
81: MASCULINE         UTAPAU         PAU'AN
82: MASCULINE           <NA>          HUMAN
83:  FEMININE           <NA>          HUMAN
84: MASCULINE           <NA>          HUMAN
85: MASCULINE           <NA>          DROID
86:      <NA>           <NA>           <NA>
87:  FEMININE          NABOO          HUMAN
       GENDER      HOMEWORLD        SPECIES
                                                                                                                                                         FILMS
 1:                                               C("THE EMPIRE STRIKES BACK", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE", "THE FORCE AWAKENS")
 2:                      C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE")
 3: C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE", "THE FORCE AWAKENS")
 4:                                                                    C("THE EMPIRE STRIKES BACK", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE")
 5:                                               C("THE EMPIRE STRIKES BACK", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE", "THE FORCE AWAKENS")
 6:                                                                                             C("ATTACK OF THE CLONES", "REVENGE OF THE SITH", "A NEW HOPE")
 7:                                                                                             C("ATTACK OF THE CLONES", "REVENGE OF THE SITH", "A NEW HOPE")
 8:                                                                                                                                                 A NEW HOPE
 9:                                                                                                                                                 A NEW HOPE
10:                      C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE")
11:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
12:                                                                                                                     C("REVENGE OF THE SITH", "A NEW HOPE")
13:                                               C("THE EMPIRE STRIKES BACK", "REVENGE OF THE SITH", "RETURN OF THE JEDI", "A NEW HOPE", "THE FORCE AWAKENS")
14:                                                                      C("THE EMPIRE STRIKES BACK", "RETURN OF THE JEDI", "A NEW HOPE", "THE FORCE AWAKENS")
15:                                                                                                                                                 A NEW HOPE
16:                                                                                                C("THE PHANTOM MENACE", "RETURN OF THE JEDI", "A NEW HOPE")
17:                                                                                           C("THE EMPIRE STRIKES BACK", "RETURN OF THE JEDI", "A NEW HOPE")
18:                                                                                                                                                 A NEW HOPE
19:                                    C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH", "RETURN OF THE JEDI")
20:                                    C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH", "RETURN OF THE JEDI")
21:                                                                                 C("THE EMPIRE STRIKES BACK", "ATTACK OF THE CLONES", "RETURN OF THE JEDI")
22:                                                                                                                                    THE EMPIRE STRIKES BACK
23:                                                                                                                                    THE EMPIRE STRIKES BACK
24:                                                                                                         C("THE EMPIRE STRIKES BACK", "RETURN OF THE JEDI")
25:                                                                                                                                    THE EMPIRE STRIKES BACK
26:                                                                                                               C("RETURN OF THE JEDI", "THE FORCE AWAKENS")
27:                                                                                                                                         RETURN OF THE JEDI
28:                                                                                                                                         RETURN OF THE JEDI
29:                                                                                                                                         RETURN OF THE JEDI
30:                                                                                                                                         RETURN OF THE JEDI
31:                                                                                                                                         THE PHANTOM MENACE
32:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
33:                                                                                                                                         THE PHANTOM MENACE
34:                                                                                                            C("ATTACK OF THE CLONES", "THE PHANTOM MENACE")
35:                                                                                                                                         THE PHANTOM MENACE
36:                                                                                                                                         THE PHANTOM MENACE
37:                                                                                                                                         THE PHANTOM MENACE
38:                                                                                                            C("ATTACK OF THE CLONES", "THE PHANTOM MENACE")
39:                                                                                                                                         THE PHANTOM MENACE
40:                                                                                                                                         THE PHANTOM MENACE
41:                                                                                                            C("ATTACK OF THE CLONES", "THE PHANTOM MENACE")
42:                                                                                                                                         THE PHANTOM MENACE
43:                                                                                                                                         RETURN OF THE JEDI
44:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
45:                                                                                                                                         THE PHANTOM MENACE
46:                                                                                                                                         THE PHANTOM MENACE
47:                                                                                                                                         THE PHANTOM MENACE
48:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
49:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
50:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
51:                                                                                                             C("THE PHANTOM MENACE", "REVENGE OF THE SITH")
52:                                                                                                             C("THE PHANTOM MENACE", "REVENGE OF THE SITH")
53:                                                                                                             C("THE PHANTOM MENACE", "REVENGE OF THE SITH")
54:                                                                                                                                         THE PHANTOM MENACE
55:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
56:                                                                                                            C("ATTACK OF THE CLONES", "THE PHANTOM MENACE")
57:                                                                                                                                       ATTACK OF THE CLONES
58:                                                                                                                                       ATTACK OF THE CLONES
59:                                                                                                                                       ATTACK OF THE CLONES
60:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
61:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
62:                                                                                                                                       ATTACK OF THE CLONES
63:                                                                                                                                       ATTACK OF THE CLONES
64:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
65:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
66:                                                                                                                                       ATTACK OF THE CLONES
67:                                                                                                                                       ATTACK OF THE CLONES
68:                                                                                                                                       ATTACK OF THE CLONES
69:                                                                                                                                       ATTACK OF THE CLONES
70:                                                                                                                                       ATTACK OF THE CLONES
71:                                                                                                                                       ATTACK OF THE CLONES
72:                                                                                                                                         THE PHANTOM MENACE
73:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
74:                                                                                                                                       ATTACK OF THE CLONES
75:                                                                                                                                       ATTACK OF THE CLONES
76:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
77:                                                                                                                                        REVENGE OF THE SITH
78:                                                                                                                                        REVENGE OF THE SITH
79:                                                                                                                     C("REVENGE OF THE SITH", "A NEW HOPE")
80:                                                                                                           C("ATTACK OF THE CLONES", "REVENGE OF THE SITH")
81:                                                                                                                                        REVENGE OF THE SITH
82:                                                                                                                                          THE FORCE AWAKENS
83:                                                                                                                                          THE FORCE AWAKENS
84:                                                                                                                                          THE FORCE AWAKENS
85:                                                                                                                                          THE FORCE AWAKENS
86:                                                                                                                                          THE FORCE AWAKENS
87:                                                                                     C("ATTACK OF THE CLONES", "THE PHANTOM MENACE", "REVENGE OF THE SITH")
                                                                                                                                                         FILMS
                                       VEHICLES
 1:   C("SNOWSPEEDER", "IMPERIAL SPEEDER BIKE")
 2:                                CHARACTER(0)
 3:                                CHARACTER(0)
 4:                                CHARACTER(0)
 5:                       IMPERIAL SPEEDER BIKE
 6:                                CHARACTER(0)
 7:                                CHARACTER(0)
 8:                                CHARACTER(0)
 9:                                CHARACTER(0)
10:                             TRIBUBBLE BONGO
11: C("ZEPHYR-G SWOOP BIKE", "XJ-6 AIRSPEEDER")
12:                                CHARACTER(0)
13:                                       AT-ST
14:                                CHARACTER(0)
15:                                CHARACTER(0)
16:                                CHARACTER(0)
17:                                 SNOWSPEEDER
18:                                CHARACTER(0)
19:                                CHARACTER(0)
20:                                CHARACTER(0)
21:                                CHARACTER(0)
22:                                CHARACTER(0)
23:                                CHARACTER(0)
24:                                CHARACTER(0)
25:                                CHARACTER(0)
26:                                CHARACTER(0)
27:                                CHARACTER(0)
28:                                CHARACTER(0)
29:                                CHARACTER(0)
30:                                CHARACTER(0)
31:                             TRIBUBBLE BONGO
32:                                CHARACTER(0)
33:                                CHARACTER(0)
34:                                CHARACTER(0)
35:                                CHARACTER(0)
36:                                CHARACTER(0)
37:                                CHARACTER(0)
38:                                CHARACTER(0)
39:                                CHARACTER(0)
40:                                CHARACTER(0)
41:                                CHARACTER(0)
42:                                SITH SPEEDER
43:                                CHARACTER(0)
44:                                CHARACTER(0)
45:                                CHARACTER(0)
46:                                CHARACTER(0)
47:                                CHARACTER(0)
48:                                CHARACTER(0)
49:                                CHARACTER(0)
50:                                CHARACTER(0)
51:                                CHARACTER(0)
52:                                CHARACTER(0)
53:                                CHARACTER(0)
54:                                CHARACTER(0)
55:                                CHARACTER(0)
56:                                CHARACTER(0)
57:                                CHARACTER(0)
58:                                CHARACTER(0)
59:                                CHARACTER(0)
60:                                CHARACTER(0)
61:                                CHARACTER(0)
62:                                CHARACTER(0)
63:                                CHARACTER(0)
64:                            FLITKNOT SPEEDER
65:                                CHARACTER(0)
66:                                CHARACTER(0)
67:                  KORO-2 EXODRIVE AIRSPEEDER
68:                                CHARACTER(0)
69:                                CHARACTER(0)
70:                                CHARACTER(0)
71:                                CHARACTER(0)
72:                                CHARACTER(0)
73:                                CHARACTER(0)
74:                                CHARACTER(0)
75:                                CHARACTER(0)
76:                                CHARACTER(0)
77:                 TSMEU-6 PERSONAL WHEEL BIKE
78:                                CHARACTER(0)
79:                                CHARACTER(0)
80:                                CHARACTER(0)
81:                                CHARACTER(0)
82:                                CHARACTER(0)
83:                                CHARACTER(0)
84:                                CHARACTER(0)
85:                                CHARACTER(0)
86:                                CHARACTER(0)
87:                                CHARACTER(0)
                                       VEHICLES
                                                                                                                STARSHIPS
 1:                                                                                       C("X-WING", "IMPERIAL SHUTTLE")
 2:                                                                                                          CHARACTER(0)
 3:                                                                                                          CHARACTER(0)
 4:                                                                                                       TIE ADVANCED X1
 5:                                                                                                          CHARACTER(0)
 6:                                                                                                          CHARACTER(0)
 7:                                                                                                          CHARACTER(0)
 8:                                                                                                          CHARACTER(0)
 9:                                                                                                                X-WING
10: C("JEDI STARFIGHTER", "TRADE FEDERATION CRUISER", "NABOO STAR SKIFF", "JEDI INTERCEPTOR", "BELBULLAB-22 STARFIGHTER")
11:                                                    C("TRADE FEDERATION CRUISER", "JEDI INTERCEPTOR", "NABOO FIGHTER")
12:                                                                                                          CHARACTER(0)
13:                                                                            C("MILLENNIUM FALCON", "IMPERIAL SHUTTLE")
14:                                                                            C("MILLENNIUM FALCON", "IMPERIAL SHUTTLE")
15:                                                                                                          CHARACTER(0)
16:                                                                                                          CHARACTER(0)
17:                                                                                                                X-WING
18:                                                                                                                X-WING
19:                                                                                                          CHARACTER(0)
20:                                                                                                          CHARACTER(0)
21:                                                                                                               SLAVE 1
22:                                                                                                          CHARACTER(0)
23:                                                                                                          CHARACTER(0)
24:                                                                                                     MILLENNIUM FALCON
25:                                                                                                          CHARACTER(0)
26:                                                                                                          CHARACTER(0)
27:                                                                                                          CHARACTER(0)
28:                                                                                                                A-WING
29:                                                                                                          CHARACTER(0)
30:                                                                                                     MILLENNIUM FALCON
31:                                                                                                          CHARACTER(0)
32:                                                                                                          CHARACTER(0)
33:                                                                                                          CHARACTER(0)
34:                                                                                                          CHARACTER(0)
35:                                                                                                          CHARACTER(0)
36:                                                                                                          CHARACTER(0)
37:                                                                                                  NABOO ROYAL STARSHIP
38:                                                                                                          CHARACTER(0)
39:                                                                                                          CHARACTER(0)
40:                                                                                                          CHARACTER(0)
41:                                                                                                          CHARACTER(0)
42:                                                                                                              SCIMITAR
43:                                                                                                          CHARACTER(0)
44:                                                                                                          CHARACTER(0)
45:                                                                                                          CHARACTER(0)
46:                                                                                                          CHARACTER(0)
47:                                                                                                          CHARACTER(0)
48:                                                                                                          CHARACTER(0)
49:                                                                                                          CHARACTER(0)
50:                                                                                                          CHARACTER(0)
51:                                                                                                          CHARACTER(0)
52:                                                                                                          CHARACTER(0)
53:                                                                                                          CHARACTER(0)
54:                                                                                                          CHARACTER(0)
55:                                                                                                      JEDI STARFIGHTER
56:                                                                                                          CHARACTER(0)
57:                                                                                                         NABOO FIGHTER
58:                                                                                                          CHARACTER(0)
59:                                                                                                          CHARACTER(0)
60:                                                                                                          CHARACTER(0)
61:                                                                                                          CHARACTER(0)
62:                                                                                                          CHARACTER(0)
63:                                                                                                          CHARACTER(0)
64:                                                                                                          CHARACTER(0)
65:                                                                                                          CHARACTER(0)
66:                                                                                                          CHARACTER(0)
67:                                                                                                          CHARACTER(0)
68:                                                                                                          CHARACTER(0)
69:                                                                                                          CHARACTER(0)
70:                                                                                                          CHARACTER(0)
71:                                                                                                          CHARACTER(0)
72:                                                                                                          CHARACTER(0)
73:                                                                                                          CHARACTER(0)
74:                                                                                                          CHARACTER(0)
75:                                                                                                          CHARACTER(0)
76:                                                                                                          CHARACTER(0)
77:                                                                                              BELBULLAB-22 STARFIGHTER
78:                                                                                                          CHARACTER(0)
79:                                                                                                          CHARACTER(0)
80:                                                                                                          CHARACTER(0)
81:                                                                                                          CHARACTER(0)
82:                                                                                                          CHARACTER(0)
83:                                                                                                          CHARACTER(0)
84:                                                                                                   T-70 X-WING FIGHTER
85:                                                                                                          CHARACTER(0)
86:                                                                                                          CHARACTER(0)
87:                                                         C("H-TYPE NUBIAN YACHT", "NABOO STAR SKIFF", "NABOO FIGHTER")
                                                                                                                STARSHIPS

Exercise 3

stormsDT <- as.data.table(storms)
numericCols = sapply(stormsDT, is.numeric)
numericCols = names(numericCols)[numericCols]
stormsDTNew <- na.omit(stormsDT, cols = numericCols)
characterCols <- sapply(stormsDTNew, is.character)
characterCols <- names(characterCols)[characterCols]
colsOfInterest <- c("ts_diameter", "hu_diameter")
stormsDTNew[, lapply(.SD, mean), .SDcols = colsOfInterest, by = characterCols]
        name              status ts_diameter hu_diameter
  1:    Alex tropical depression     0.00000     0.00000
  2:    Alex      tropical storm   150.80013     0.00000
  3:    Alex           hurricane   304.40871    63.56690
  4:  Bonnie tropical depression     0.00000     0.00000
  5:  Bonnie      tropical storm    75.40637     0.00000
 ---                                                    
273: Joaquin      tropical storm   198.92054     0.00000
274: Joaquin           hurricane   332.98376    85.00923
275:    Kate tropical depression     0.00000     0.00000
276:    Kate      tropical storm    85.48651     0.00000
277:    Kate           hurricane   273.31025    31.64645

.SD: Subset of data - choose row by ordinal position

irisDT[, .SD[1L]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:            6         2.2            5         1.5 virginica
   maxLength minWidth Sepal.Length.Width
1:       7.9      0.1              6/2.2
irisDT[, .SD[1L], by = Species]
      Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1:  virginica            6         2.2          5.0         1.5
2: versicolor            5           2          3.5         1.0
3:     setosa          4.5         2.3          1.3         0.3
   maxLength minWidth Sepal.Length.Width
1:       7.9      0.1              6/2.2
2:       7.9      0.1                5/2
3:       7.9      0.1            4.5/2.3
irisDT[, .SD[1L], by = Species, .SDcols = c("Petal.Length", "Petal.Width")]
      Species Petal.Length Petal.Width
1:  virginica          5.0         1.5
2: versicolor          3.5         1.0
3:     setosa          1.3         0.3

.SD: Subset of Data - select top entries (in each group)

irisDT[order(-Sepal.Length), .SD[1:2]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1:          7.9         3.8          6.4         2.0 virginica
2:          7.7         2.6          6.9         2.3 virginica
   maxLength minWidth Sepal.Length.Width
1:       7.9      0.1            7.9/3.8
2:       7.9      0.1            7.7/2.6
irisDT[, .SD[which.max(Sepal.Length)], by = Species, .SDcols = c("Sepal.Length","Sepal.Width")]
      Species Sepal.Length Sepal.Width
1:  virginica          7.9         3.8
2: versicolor            7         3.2
3:     setosa          5.8           4

Multiple if else statements

irisDT[, text := {
  if(Sepal.Length > 4 & Sepal.Width > 4) "sepal length and width is larger than 4"
  else if(Petal.Length < 2 & Petal.Width < 2) "petal length and width is smaller than 2"
  else "other"
}, by = .(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)][, .SD[1], by = text]
                                       text Sepal.Length Sepal.Width
1:                                    other            6         2.2
2: petal length and width is smaller than 2          4.5         2.3
3:  sepal length and width is larger than 4          5.2         4.1
   Petal.Length Petal.Width   Species maxLength minWidth
1:          5.0         1.5 virginica       7.9      0.1
2:          1.3         0.3    setosa       7.9      0.1
3:          1.5         0.1    setosa       7.9      0.1
   Sepal.Length.Width
1:              6/2.2
2:            4.5/2.3
3:            5.2/4.1
bin_iris <- function(sl, sw, pl, pw){
  if(sl > 4 & sw > 4) "sepal length and width is larger than 4"
  else if(pl < 2 & pw < 2) "petal length and width is smaller than 2"
  else "other"
}
irisDT[, text := bin_iris(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width), 
       by = .(Species, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)][, .SD[1], by = text]
                                       text Sepal.Length Sepal.Width
1:                                    other            6         2.2
2: petal length and width is smaller than 2          4.5         2.3
3:  sepal length and width is larger than 4          5.2         4.1
   Petal.Length Petal.Width   Species maxLength minWidth
1:          5.0         1.5 virginica       7.9      0.1
2:          1.3         0.3    setosa       7.9      0.1
3:          1.5         0.1    setosa       7.9      0.1
   Sepal.Length.Width
1:              6/2.2
2:            4.5/2.3
3:            5.2/4.1

Hands - on 4

Exercise 1

Print the fifth row for each species.

starwarsDT <- as.data.table(starwars)
starwarsDT[, if(.N >= 5) .SD[5L], by = species]
   species               name height mass hair_color  skin_color
1:   Human Beru Whitesun lars    165   75      brown       light
2:   Droid             R4-P17     96   NA       none silver, red
   eye_color birth_year    sex   gender homeworld
1:      blue         47 female feminine  Tatooine
2: red, blue         NA   none feminine      <NA>
                                                 films vehicles
1: Attack of the Clones,Revenge of the Sith,A New Hope         
2:            Attack of the Clones,Revenge of the Sith         
   starships
1:          
2:          

Exercise 2

stormsDT <- as.data.table(storms)
stormsDT[, .(name, status, pressure)][order(name, pressure)][, .SD[1:min(5, .N)], by = name]
         name              status pressure
  1: AL011993 tropical depression      999
  2: AL011993 tropical depression      999
  3: AL011993 tropical depression      999
  4: AL011993 tropical depression      999
  5: AL011993 tropical depression     1000
 ---                                      
978:     Zeta      tropical storm      994
979:     Zeta      tropical storm      994
980:     Zeta      tropical storm      994
981:     Zeta      tropical storm      994
982:     Zeta      tropical storm      994

Citation

For attribution, please cite this work as

Okola (2020, Dec. 4). Basil Okola: A deeper look into data.table functionalities. Retrieved from https://bokola214.netlify.app/posts/welcome/

BibTeX citation

@misc{okola2020a,
  author = {Okola, Basil},
  title = {Basil Okola: A deeper look into data.table functionalities},
  url = {https://bokola214.netlify.app/posts/welcome/},
  year = {2020}
}