CSV (Comma-Separated Values) files are the simplest way to represent tabular data. From spreadsheets to database exports and APIs, CSV is widely used due to its readability and portability. In Java, CSV handling is a critical skill for developers working with data pipelines, reporting systems, log analyzers, and ETL processes.
This tutorial covers reading and writing CSV files in Java, explaining I/O basics, performance considerations, advanced integrations with NIO.2, and real-world case studies.
Basics of Java I/O
Streams vs Readers/Writers
- InputStream/OutputStream → Work with raw binary data.
- Reader/Writer → Work with text data like CSV.
- BufferedReader/BufferedWriter → Improve efficiency when working with large CSV files.
Reading CSV Files
Using BufferedReader
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class CSVReaderExample {
    public static void main(String[] args) {
        String filePath = "data.csv";
        try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] values = line.split(",");
                for (String value : values) {
                    System.out.print(value + " | ");
                }
                System.out.println();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Using Scanner
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class CSVScannerExample {
    public static void main(String[] args) throws FileNotFoundException {
        try (Scanner scanner = new Scanner(new File("data.csv"))) {
            while (scanner.hasNextLine()) {
                String[] values = scanner.nextLine().split(",");
                for (String value : values) {
                    System.out.print(value + " | ");
                }
                System.out.println();
            }
        }
    }
}
Writing CSV Files
Using BufferedWriter
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class CSVWriterExample {
    public static void main(String[] args) {
        String filePath = "output.csv";
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(filePath))) {
            bw.write("Name,Age,Country");
            bw.newLine();
            bw.write("Alice,30,USA");
            bw.newLine();
            bw.write("Bob,25,UK");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Using PrintWriter
import java.io.FileNotFoundException;
import java.io.PrintWriter;

public class CSVPrintWriterExample {
    public static void main(String[] args) {
        try (PrintWriter writer = new PrintWriter("output.csv")) {
            writer.println("Name,Age,Country");
            writer.println("Charlie,28,Canada");
            writer.println("Diana,35,Germany");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
Intermediate Concepts
Buffered I/O
Always use buffering for large CSV files to reduce disk I/O operations.
RandomAccessFile for CSV
Rarely used but can update specific rows by seeking offsets.
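As a sketch of that idea, the example below assumes fixed-length records (the 16-byte record length, padding helper, and file name are illustrative, not part of any standard CSV layout) so a row's byte offset can be computed and the row overwritten in place:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RandomAccessCsv {
    // Each record occupies exactly RECORD_LEN bytes: padded fields plus "\n".
    static final int RECORD_LEN = 16;

    public static void main(String[] args) throws IOException {
        Path file = Path.of("fixed.csv");
        Files.writeString(file, pad("Alice,30") + pad("Bob,25"));

        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            // Seek directly to row 1 (the second record) and overwrite it in place.
            raf.seek(1L * RECORD_LEN);
            raf.write(pad("Bob,26").getBytes(StandardCharsets.US_ASCII));
        }
        System.out.print(Files.readString(file));
    }

    // Pads a row to the fixed record length; "\n" is used (not %n) so the
    // record size is identical on every platform.
    static String pad(String row) {
        return String.format("%-" + (RECORD_LEN - 1) + "s", row) + "\n";
    }
}
```

This only works because every record has the same byte length; with variable-length rows an in-place overwrite would corrupt neighboring records, which is why the technique is rarely used for CSV.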
Serialization
CSV is not natively serialized, but data from CSV files can be converted into Java objects (POJOs) and serialized.
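For illustration, here is one way to map CSV rows onto a POJO; the Person record and field layout are hypothetical. Because records can implement interfaces, Person can implement Serializable and then be written with standard Java object serialization:

```java
import java.io.Serializable;
import java.util.List;

public class CsvToPojo {
    // Hypothetical POJO for rows shaped like "name,age,country".
    // Implements Serializable so instances can go through ObjectOutputStream.
    record Person(String name, int age, String country) implements Serializable {}

    // Converts one simple CSV line (no quoted fields) into a Person.
    static Person fromCsv(String line) {
        String[] f = line.split(",");
        return new Person(f[0].trim(), Integer.parseInt(f[1].trim()), f[2].trim());
    }

    public static void main(String[] args) {
        List<String> rows = List.of("Alice,30,USA", "Bob,25,UK");
        rows.stream().map(CsvToPojo::fromCsv).forEach(System.out::println);
    }
}
```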
Structured Data Formats
- CSV → Simple, tabular data.
- JSON/XML → Hierarchical, structured data.
- Use libraries like OpenCSV, Apache Commons CSV for advanced parsing.
Advanced I/O with NIO and NIO.2
Reading CSV with NIO.2
import java.nio.file.*;
import java.util.List;

public class NioCSVExample {
    public static void main(String[] args) throws Exception {
        Path path = Paths.get("data.csv");
        List<String> lines = Files.readAllLines(path);
        lines.forEach(System.out::println);
    }
}
Streaming Large CSV Files
try (var lines = Files.lines(Paths.get("data.csv"))) { // the stream holds a file handle, so close it
    lines.map(line -> line.split(","))
         .forEach(values -> System.out.println(String.join(" | ", values)));
}
Channels and Buffers
For very large CSV files (GB scale), use FileChannel and ByteBuffer for performance.
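A minimal sketch of chunked reading with FileChannel and ByteBuffer follows; the file name is illustrative, and it assumes ASCII-only content so that no multi-byte character straddles a chunk boundary:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelCsvReader {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("data.csv");
        Files.writeString(path, "Name,Age\nAlice,30\nBob,25\n");

        StringBuilder content = new StringBuilder();
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(8192); // read in fixed-size chunks
            while (channel.read(buffer) != -1) {
                buffer.flip(); // switch the buffer from fill mode to drain mode
                // Decoding per chunk is safe here only because the data is ASCII.
                content.append(StandardCharsets.UTF_8.decode(buffer));
                buffer.clear();
            }
        }
        // Lines may span buffer boundaries, so split only after all chunks are decoded.
        for (String line : content.toString().split("\n")) {
            System.out.println(line);
        }
    }
}
```

In a production reader you would carry partial lines between chunks (or use a CharsetDecoder) instead of accumulating the whole file in memory.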
Performance & Best Practices
- Use BufferedReader/BufferedWriter for efficiency.
- Use libraries (OpenCSV, Apache Commons CSV) for production-grade parsing.
- Always close resources using try-with-resources.
- Handle character encodings explicitly (UTF-8 preferred).
- Validate CSV inputs to avoid malformed data.
Framework Case Studies
- Spring Boot: Uses CSV for batch data import/export.
- Hibernate: Populates DB from CSV seeds.
- Log4j: Supports CSV layouts for structured log output.
- Microservices: Share CSVs as lightweight data exchange.
- ETL Pipelines: Use CSV for staging and transformation.
Real-World Scenarios
- Data Import/Export: CSV ↔ Database.
- Report Generation: Export reports as CSV.
- Log Analysis: Parse server logs stored as CSV.
- ETL Pipelines: Process and transform tabular data.
- API Responses: Some REST APIs provide CSV downloads.
📌 What's New in Java Versions?
- Java 7+: NIO.2 Path and Files APIs simplify CSV handling.
- Java 8: Streams API enables functional-style parsing (Files.lines).
- Java 11: Files.readString/writeString convenience methods.
- Java 17: NIO performance optimizations.
- Java 21: Virtual threads simplify file-heavy CSV workflows.
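A quick sketch of the Java 11 convenience methods mentioned above (file name and data illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWriteStringCsv {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("mini.csv");
        // Java 11+: write a whole small CSV in one call (UTF-8 by default).
        Files.writeString(path, "Name,Age\nAlice,30\n");
        // ...and read it back in one call.
        String csv = Files.readString(path);
        System.out.print(csv);
    }
}
```

These one-call methods suit small files; for large files prefer the streaming approaches shown earlier.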
Conclusion & Key Takeaways
CSV file handling in Java is a fundamental skill for developers working with data. With Readers, Writers, and modern NIO.2 APIs, developers can build efficient and scalable CSV import/export systems.
Key Takeaways:
- Use BufferedReader/Writer for basic CSV operations.
- Leverage libraries like OpenCSV for complex parsing.
- Use NIO.2 and Streams for scalability.
- Always validate and encode data properly.
FAQ
Q1. What’s the difference between CSV and JSON/XML?
A: CSV is flat and tabular, JSON/XML support hierarchical data.
Q2. Can I parse CSV manually in Java?
A: Yes, but libraries handle edge cases like quoted fields.
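To make that edge case concrete, here is a toy quote-aware parser contrasted with a naive split; it deliberately ignores escaped quotes ("" inside a field), which real libraries handle:

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedFieldDemo {
    // Minimal parser that respects double-quoted fields (no escaped-quote support).
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;           // toggle quoted state, drop the quote itself
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // a comma is a boundary only outside quotes
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "Alice,\"New York, NY\",USA";
        System.out.println(line.split(",").length); // naive split: 4 fields (wrong)
        System.out.println(parse(line));            // quote-aware parse: 3 fields
    }
}
```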
Q3. What’s the best library for CSV in Java?
A: OpenCSV and Apache Commons CSV are widely used.
Q4. Can Scanner read CSV files?
A: Yes, but BufferedReader is faster for large files.
Q5. How do I handle different delimiters (e.g., tab, semicolon)?
A: Use split("\t") for tab-delimited data, or libraries for flexibility.
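A tiny illustration of swapping delimiters with split (the sample strings are made up):

```java
public class DelimiterExample {
    public static void main(String[] args) {
        // Tab-separated values: split on the tab character.
        String[] tabFields = "Name\tAge\tCountry".split("\t");
        // Semicolon-separated values, common in European locales.
        String[] semiFields = "Alice;30;USA".split(";");
        System.out.println(tabFields.length + " fields, " + semiFields.length + " fields");
    }
}
```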
Q6. Is RandomAccessFile good for CSV updates?
A: Only if fixed-length rows exist; otherwise inefficient.
Q7. How do I handle encodings in CSV?
A: Specify the charset in Readers/Writers (UTF-8).
Q8. Can I use Java Streams to process CSV?
A: Yes, Files.lines() is ideal for large files.
Q9. Can I lock CSV files to prevent concurrent access?
A: Yes, use FileChannel.lock().
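A minimal sketch of exclusive locking while appending (file name illustrative); note that file locks are advisory on many platforms, so they only coordinate processes that also use locks:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CsvLockExample {
    public static void main(String[] args) throws IOException {
        Path path = Path.of("locked.csv");
        Files.writeString(path, "Name,Age\n");

        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) {
            // The exclusive lock is held while appending; other lock-aware
            // processes will block or fail until it is released.
            channel.write(ByteBuffer.wrap("Alice,30\n".getBytes(StandardCharsets.UTF_8)));
        } // try-with-resources releases the lock, then closes the channel
        System.out.print(Files.readString(path));
    }
}
```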
Q10. How do frameworks like Spring Boot handle CSV?
A: Via libraries (OpenCSV, Jackson CSV) integrated into batch processing.