I’ve been asked to compute some data and statistics from an Excel spreadsheet containing an huge phonebook
The operations I need wasn’t very complicated, like to find and remove duplicate rows and so on, but I didn’t find any quick way to achieve those simple tasks using just Excel (or OpenOffice and LibreOffice).
Since I’m good in SQL, I decided to move this data in a MySQL database, then I wondered what’s the simplest way to obtain this result.
Excel Data Structure
My excel data is on three simple columns:
- First Name
- Last Name
- Phone Number
I need to export my 20,000 rows in a .csv file.
Export XLS Data in a CSV
Using Microsoft Excel
From Excel, go in “Save As” then pick the option “Other Formats”, and from the combo box, choose Comma Delimited CSV.
Microsoft Excel by default creates values separated by a Comma, and the single column is not enclosed by any special char.
Using OpenOffice or LibreOffice
In OpenOffice, choose Save As and then CSV, using the default options the .csv file will have values separated by semicolon and enclosed by double quotes.
Create the MySQL table to import the CSV
It’s time to create the basic data structure in MySQL that will be able to host the data we exported from Excel. So the simple task is to generate a table with the same number of columns (and type) that will be associated to the Excel columns.
create table phonebook (first_name varchar(100), last_name varchar(100), phone_number varchar(100))
And now, the last step, importing the CSV in MySQL
Import the CSV (generated from an XLS) into MySQL table
Mysql offers a useful command for the operation of importing the CSV in a table, the command is LOAD DATA LOCAL INFILE.
And now the code in the case you exported the CSV from OpenOffice and the rows have the following structure:
“Mario”,”Rossi”,”+390123456789″
The code to load the data is:
load data local infile ‘phonebook.csv’ into table phonebook fields terminated by ‘,’ enclosed by ‘”‘ lines terminated by ‘\n’ (first_name, last_name, phone_number);
If you exported using Microsoft Office the rows have the following structure:
Mario;Rossi;+390123456789
The code to load the data is:
load data local infile 'phonebook.csv' into table phonebook fields terminated by ';' enclosed by '' lines terminated by '\n' (first_name, last_name, phone_number);
I value the blog article.Really looking forward to read more. Cool.