Read large CSV files quickly with a Dask DataFrame
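import dask.dataframe as dd

# Names to assign to the four columns
n = ["column1", "column2", "column3", "column4"]

# assume_missing=True treats integer columns without an explicit dtype as floats,
# so missing values do not cause dtype errors
df = dd.read_csv('D:/BigData/data1.csv', assume_missing=True, names=n)
print(df.head())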
I always forget exactly how the with statement works for opening and reading files in Python.

path = './path/filename.txt'
with open(path, 'r') as file:
    data = file.readlines()
print(data)

Or, if you would like to avoid getting '\n' at the end of each line when using .readlines(), you can use this inside the with block instead:

    data = file.read().splitlines()

Opening files with 'with' …
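To see the difference between .readlines() and .read().splitlines() side by side, here is a minimal sketch. It assumes a throwaway file named demo.txt in a temporary directory; that file and its contents are made up for illustration and are not part of the original post.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.txt")
with open(path, "w") as file:
    file.write("first\nsecond\nthird\n")

# readlines() keeps the newline character at the end of each line.
with open(path, "r") as file:
    with_newlines = file.readlines()

# read().splitlines() returns the same lines without the newline character.
with open(path, "r") as file:
    without_newlines = file.read().splitlines()

print(with_newlines)     # ['first\n', 'second\n', 'third\n']
print(without_newlines)  # ['first', 'second', 'third']

# The with statement closes the file as soon as the block exits.
print(file.closed)       # True
```

In both cases the file is closed automatically when the with block ends, so there is no need to call file.close() yourself.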