Have you ever had a dataset suddenly slow your application down, bloat your files, and turn simple data exchange into endless workarounds? It’s especially frustrating when the format was chosen “out of habit” and you end up paying for it in performance and development time.
This chapter explains why professional developers don’t argue “XML vs JSON” but choose the tool that fits the scenario, and in doing so speed up parsing, reduce memory consumption, and get predictable behavior with large datasets. It also makes a non-obvious point: sometimes the “convenient” approach makes a system fragile, while a “slightly stricter” one is what saves the project.
We’ll cover three XML strategies in Qt (DOM, SAX, and QXmlStreamReader/QXmlStreamWriter as the recommended fast option), as well as practical work with QJsonDocument, QJsonObject, and QJsonArray. Protobuf then shows why binary serialization typically produces files 3–10× smaller than JSON and is noticeably faster under load.
If you want to stop “guessing the format” and start choosing it confidently, don’t skip this chapter.
This chapter includes ready-to-use code examples.
Chapter Self-Check
Why is DOM unsuitable for processing multi-gigabyte XML files, and what alternative exists?
Correct answer: DOM loads the entire document into memory as a tree structure, which exhausts RAM on large files. For such cases, use a streaming parser such as SAX or, in Qt, QXmlStreamReader: it processes the XML sequentially and keeps only the current fragment in memory.
What is the fundamental difference between a node (QDomNode) and an element (QDomElement) in XML DOM?
Correct answer: A node is the base class for every component of the document (elements, text, comments, attributes), while an element is the specific node type that represents a tag together with its attributes and child nodes. Every element is a node, but not every node is an element.
Why are QXmlStreamReader/QXmlStreamWriter recommended as the primary way to work with XML in Qt6 instead of DOM and SAX?
Correct answer: These classes combine a convenient programming model (as with DOM) with low memory usage (as with SAX). They are fast, and their pull-style API lets the application read tokens in an ordinary loop without writing separate event-handler classes.
Why does Protocol Buffers use numeric field identifiers (=1, =2, =3) instead of field names like JSON?
Correct answer: In the binary format, each field is identified by its number rather than its name, and a small number takes far less space than a text field name repeated in every record. This is one reason Protobuf output is typically 3–10 times smaller than JSON.
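To make the size argument concrete, here is a minimal sketch of how a single Protobuf field is laid out on the wire, written by hand in standard C++ rather than with protoc-generated classes. It relies on the documented wire format, in which each field starts with a varint key equal to (field_number << 3) | wire_type, followed by the value. The helper names appendVarint and encodeVarintField, and the example field `int32 id = 1;`, are illustrative assumptions and not part of the Protobuf library or of this chapter's code.

```cpp
#include <cstdint>
#include <string>

// Append an unsigned integer in base-128 varint form:
// 7 payload bits per byte, high bit set on every byte except the last.
inline void appendVarint(std::string &out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Encode one varint-typed field (wire type 0), e.g. a hypothetical
// `int32 id = 1;`. The field name never reaches the wire; only its
// number does, packed into the key together with the wire type.
inline std::string encodeVarintField(std::uint32_t fieldNumber,
                                     std::uint64_t value) {
    constexpr std::uint64_t kWireTypeVarint = 0;
    std::string out;
    appendVarint(out, (static_cast<std::uint64_t>(fieldNumber) << 3) | kWireTypeVarint);
    appendVarint(out, value);
    return out;
}
```

With these helpers, encodeVarintField(1, 150) yields the three bytes 08 96 01, whereas the equivalent JSON text {"id":150} takes 10 bytes because the field name and punctuation are written out in full.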
Which parsing method should be chosen for importing 2 GB XML log files with limited RAM?
Correct answer: Use a streaming parser (SAX or, in Qt, QXmlStreamReader). It reads the file sequentially and reports each tag as it is encountered, keeping only the currently processed fragment in memory, so memory consumption stays small regardless of file size.
What happens when calling the toElement() method on a node that is not an element, and how should it be handled correctly?
Correct answer: The method returns a null QDomElement. Always check the result with isNull() before using it, or confirm in advance that the node is an element by calling isElement().
Why did JSON become the preferred format for REST APIs and web applications compared to XML?
Correct answer: JSON is typically more compact than XML, has a simple structure without redundant closing tags, is easy for humans to read, and is natively supported by JavaScript, which makes client-server data exchange straightforward and efficient.
Why doesn’t a SAX parser allow random access to XML elements, unlike DOM?
Correct answer: SAX reads XML sequentially, processing one element at a time and not storing the document structure in memory. After processing an element, its data is no longer accessible, so it’s impossible to “go back” or access a random element without re-reading the file.
You’re developing a mobile app with limited bandwidth and high performance requirements. Which serialization format should you choose?
Correct answer: Protocol Buffers. It produces the most compact output of the three formats (typically 3–10× less bandwidth than JSON), serializes and deserializes quickly, and enforces strict typing, all of which matter for mobile applications.
Which Qt6 class is used to represent any value in JSON, and why is it universal?
Correct answer: QJsonValue. It can hold a string, number, boolean, null, object, or array, reports which type it holds through methods such as isString() and isObject(), and provides conversion methods (toString(), toInt(), toObject(), etc.) that return a default value when the stored type does not match.
Why is Protobuf data not human-readable, and when is this an advantage?
Correct answer: Protobuf uses compact binary representation instead of text format, ensuring minimal size and maximum processing speed. This is an advantage for high-load systems, network protocols, and real-time systems where performance is more critical than readability.
For a web app with user settings, you need a format that’s easily readable by developers and doesn’t require additional tools. What should you choose, and why?
Correct answer: JSON is ideal: it can be opened in any text editor, its structure is intuitive, it requires no schema compilation (unlike Protobuf) and no processing of redundant tags (unlike XML), and Qt provides a simple API for working with it.
What is “strict typing at compile time” in Protobuf, and why is it important?
Correct answer: The .proto file defines a type for every field (int32, string, etc.), and the protoc compiler generates C++ classes with strictly typed accessors. Type errors are therefore caught at compile time rather than at runtime, as happens with JSON.
Practical Assignments
Easy Level
Address Book Converter XML → JSON
Create a program that reads an addressbook.xml file with contacts (using QXmlStreamReader) and saves the same data in JSON format (addressbook.json). The program should correctly handle the number attribute and all contact fields (name, phone, email).
Hints: Use QXmlStreamReader to read the XML sequentially. Create a QJsonArray to store the contacts, and represent each contact as a QJsonObject. Don’t forget the number attribute: you can get it via attributes().value("number"). The final JSON structure should contain a "contacts" array in the root object.
Medium Level
Universal Parser with Method Selection
Develop a GUI application that allows loading an XML file and choosing its processing method (DOM, SAX, or Stream). The program should display statistics: processing time, memory used (approximately), and a list of all contacts. Add visualization of performance differences between methods.
Hints: Create a QComboBox for selecting the parsing method. Use QElapsedTimer to measure time. To estimate DOM memory usage, count the number of nodes created. Implement a separate function for each of the three parsing methods. Output the results to a QTextEdit and the statistics to a QLabel. Test with files of different sizes (create a test data generator).
Hard Level
Sync System with Multi-Format Support
Create an address book management application that can save and load data in three formats: XML, JSON, and Protobuf. Implement automatic format detection on load, comparison of file sizes and operation times, and an export function that saves data in all three formats simultaneously with performance report generation. Add the ability to edit contacts with changes saved in the selected format.
Hints: Create a base AbstractFormat class with virtual save() and load() methods. Implement three subclasses: XmlFormat, JsonFormat, ProtobufFormat. For format auto-detection, check the first bytes of the file or its extension. Use QFileInfo::size() for size comparison. Create a ContactManager class to manage data independently of format. For Protobuf, you’ll need to create a .proto file and compile it. In the report, display file size, write time, read time, and size ratios. The GUI should include a QTableView for the contact list, operation buttons, and a QTextEdit for the report.
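As a starting point, the interface and the auto-detection step from the hints might look like the sketch below. It uses standard C++ types as stand-ins for the Qt classes you would use in the real application. Only the AbstractFormat name and its save()/load() methods come from the hints; the Contact struct, the method signatures, the FileFormat enum, the hasSuffix and detectFormat helpers, and the ".pb" extension are all assumptions made for illustration. Protobuf data has no magic bytes, so the first-byte check is only a heuristic and the extension is consulted first.

```cpp
#include <string>
#include <vector>

// Hypothetical contact record shared by all formats.
struct Contact {
    std::string name;
    std::string phone;
    std::string email;
};

// Format-independent interface; XmlFormat, JsonFormat and ProtobufFormat
// would each override both methods.
class AbstractFormat {
public:
    virtual ~AbstractFormat() = default;
    virtual std::string save(const std::vector<Contact> &contacts) const = 0;
    virtual std::vector<Contact> load(const std::string &bytes) const = 0;
};

enum class FileFormat { Xml, Json, Protobuf };

// Case-sensitive suffix check, kept simple for the sketch.
inline bool hasSuffix(const std::string &s, const std::string &suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Prefer the file extension; fall back to the first byte of the content.
// Anything that does not look like text XML or JSON is treated as Protobuf.
inline FileFormat detectFormat(const std::string &fileName,
                               const std::string &head) {
    if (hasSuffix(fileName, ".xml"))  return FileFormat::Xml;
    if (hasSuffix(fileName, ".json")) return FileFormat::Json;
    if (hasSuffix(fileName, ".pb"))   return FileFormat::Protobuf;
    if (!head.empty()) {
        if (head[0] == '<') return FileFormat::Xml;
        if (head[0] == '{' || head[0] == '[') return FileFormat::Json;
    }
    return FileFormat::Protobuf;
}
```

Keeping detection separate from the format classes lets ContactManager pick the right AbstractFormat subclass without knowing how any of them parse data. A fuller version would also skip a UTF-8 byte-order mark and leading whitespace before inspecting the first character of a text file.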