Here’s my quick cheat-sheet on slicing columns from a Pandas dataframe. Consider this dataset:
    import pandas as pd

    df = pd.DataFrame([
        ["AA", "747", 1],
        ["AB", "A380", 1],
        ["AA", "737", 0],
        ["AB", "747", 1],
        ["AA", "737", 0]
    ], columns=["Airline", "Aircraft", "Class"])
It has 3 columns:

| Airline | Aircraft | Class |
| --- | --- | --- |
| AA | 747 | 1 |
| AB | A380 | 1 |
| AA | 737 | 0 |
| AB | 747 | 1 |
| AA | 737 | 0 |

You can use the iloc indexer to slice columns by position.
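Because `iloc` is position-based and follows Python's half-open slice convention (start included, stop excluded), it can help to map a slice back to column names before reading the arrays. Here is a minimal sketch on the same data; the helper name `sliced_columns` is made up for illustration and is not part of pandas:

```python
import pandas as pd

df = pd.DataFrame([
    ["AA", "747", 1],
    ["AB", "A380", 1],
    ["AA", "737", 0],
    ["AB", "747", 1],
    ["AA", "737", 0],
], columns=["Airline", "Aircraft", "Class"])


def sliced_columns(frame, col_slice):
    """Hypothetical helper: names of the columns picked by a positional slice."""
    return list(frame.iloc[:, col_slice].columns)


print(sliced_columns(df, slice(0, 1)))      # ['Airline']
print(sliced_columns(df, slice(1, None)))   # ['Aircraft', 'Class']
print(sliced_columns(df, slice(None, -1)))  # ['Airline', 'Aircraft']
print(sliced_columns(df, slice(-1, None)))  # ['Class']
```

`slice(0, 1)` is the object form of `0:1`, `slice(1, None)` of `1:`, and so on, so each call mirrors one of the slices in this cheat-sheet.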
1st column only:
    df.iloc[:, 0:1].values

    array([['AA'],
           ['AB'],
           ['AA'],
           ['AB'],
           ['AA']], dtype=object)
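A one-column slice such as `0:1` still returns a 2-D array, whereas a plain integer position returns a 1-D array. This small sketch, using the same data, shows the difference in shape; the variable names `two_d` and `one_d` are only for illustration:

```python
import pandas as pd

df = pd.DataFrame([
    ["AA", "747", 1],
    ["AB", "A380", 1],
    ["AA", "737", 0],
    ["AB", "747", 1],
    ["AA", "737", 0],
], columns=["Airline", "Aircraft", "Class"])

two_d = df.iloc[:, 0:1].values  # slice -> DataFrame -> 2-D array
one_d = df.iloc[:, 0].values    # integer -> Series -> 1-D array

print(two_d.shape)  # (5, 1)
print(one_d.shape)  # (5,)
```

Which one you want depends on what consumes the array: some APIs expect a column matrix, others a flat vector.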
1st through last columns (i.e. all columns):
    df.iloc[:, :].values
    df.iloc[:, 0:].values  # Alternate syntax

    array([['AA', '747', 1],
           ['AB', 'A380', 1],
           ['AA', '737', 0],
           ['AB', '747', 1],
           ['AA', '737', 0]], dtype=object)
2nd column only:
    df.iloc[:, 1:2].values

    array([['747'],
           ['A380'],
           ['737'],
           ['747'],
           ['737']], dtype=object)
2nd through last columns:
    df.iloc[:, 1:].values

    array([['747', 1],
           ['A380', 1],
           ['737', 0],
           ['747', 1],
           ['737', 0]], dtype=object)
1st through last-but-one:
    df.iloc[:, :-1].values
    df.iloc[:, 0:-1].values  # Alternate syntax

    array([['AA', '747'],
           ['AB', 'A380'],
           ['AA', '737'],
           ['AB', '747'],
           ['AA', '737']], dtype=object)
Last column only:
    df.iloc[:, -1:].values

    array([[1],
           [1],
           [0],
           [1],
           [0]])
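One common place the last two patterns show up together is splitting a table into feature columns and a target column. The sketch below assumes, purely for illustration, that `Class` is the target; the names `X` and `y` are conventions, not anything the dataset requires. It uses `to_numpy()`, which the pandas documentation recommends over `.values` and which returns the same arrays here:

```python
import pandas as pd

df = pd.DataFrame([
    ["AA", "747", 1],
    ["AB", "A380", 1],
    ["AA", "737", 0],
    ["AB", "747", 1],
    ["AA", "737", 0],
], columns=["Airline", "Aircraft", "Class"])

# Assumption: every column except the last is a feature, the last is the target.
X = df.iloc[:, :-1].to_numpy()  # 1st through last-but-one, 2-D
y = df.iloc[:, -1].to_numpy()   # last column by integer position, 1-D

print(X.shape)     # (5, 2)
print(y.shape)     # (5,)
print(y.tolist())  # [1, 1, 0, 1, 0]
```

Note that `y` uses the integer `-1` rather than the slice `-1:`, so it comes back as a flat vector instead of a single-column matrix.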