- Add a new field, position, to the json_error_t structure. This is
the position in bytes from the beginning of the input.
- Keep track of line, column and input position at the stream level.
Previously, only line was tracked, and it was tracked at the lexer
level, so this info was not available for UTF-8 decoding errors.
- While at it, refactor tests so that no separate "stripped" tests are
required. json_process is now able to strip whitespace from its
input, and the "valid" and "invalid" test suites now use this to
test both non-stripped and stripped input.
Closes GH-9.
Thanks to Basile Starynkevitch for the suggestion and initial patch.
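The stream-level bookkeeping described above can be sketched roughly as
follows. This is only an illustration: demo_stream_t and its functions
are hypothetical names, not the library's internals, and the column is
assumed to count bytes consumed on the current line.

```c
#include <stddef.h>

/* Hypothetical stream type; names are illustrative only. */
typedef struct {
    const char *data;   /* NUL-terminated input */
    size_t position;    /* bytes consumed from the beginning of the input */
    int line;           /* 1-based line number */
    int column;         /* bytes consumed on the current line */
} demo_stream_t;

static void demo_stream_init(demo_stream_t *s, const char *data)
{
    s->data = data;
    s->position = 0;
    s->line = 1;
    s->column = 0;
}

/* Return the next byte, or -1 at end of input, updating all counters. */
static int demo_stream_get(demo_stream_t *s)
{
    unsigned char c = (unsigned char)s->data[s->position];

    if (c == '\0')
        return -1;
    s->position++;
    if (c == '\n') {
        s->line++;
        s->column = 0;
    } else {
        s->column++;
    }
    return c;
}
```

Because the counters live in the stream, any layer that reads bytes
through it (including a UTF-8 decoder) can report where it stopped.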
Thanks to Jonathan Landis and Deron Meranda for showing how this can
be utilized for implementing secure memory operations.
Functions taking va_args are munged to receive arguments of type
'__va_list_tag *'. This patch uses va_copy to coerce them to the expected type
so we don't get compiler errors.
Tested on x86_64, both 32-bit and 64-bit compiles.
Reported-By: Basile Starynkevitch <basile@starynkevitch.net>
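A minimal sketch of the va_copy approach, using hypothetical function
names (sum_ints, demo_vsum, demo_sum) rather than the real API. The
va_list parameter is copied into a local variable, and the address of
that local is what gets passed on.

```c
#include <stdarg.h>

/* Helper that consumes arguments through a va_list pointer. */
static int sum_ints(int count, va_list *ap)
{
    int total = 0;

    while (count-- > 0)
        total += va_arg(*ap, int);
    return total;
}

/* va_list version. On x86_64 the parameter `ap` has decayed to
   '__va_list_tag *', so &ap is not a 'va_list *'. Copying it into a
   local va_list gives an object whose address has the right type. */
static int demo_vsum(int count, va_list ap)
{
    va_list ap_copy;
    int total;

    va_copy(ap_copy, ap);
    total = sum_ints(count, &ap_copy);
    va_end(ap_copy);
    return total;
}

/* Variadic front end for demo_vsum(). */
static int demo_sum(int count, ...)
{
    va_list ap;
    int total;

    va_start(ap, count);
    total = demo_vsum(count, ap);
    va_end(ap);
    return total;
}
```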
Now, by default, unpacking doesn't check that all array and object
items are accessed. The check can be enabled globally by using the
JSON_STRICT flag (formerly JSON_UNPACK_ONLY), or on a per-value basis
by using the new '!' format character. The '*' format character is
still available to disable the global check on a per-value basis.
Expand the pack/unpack API: json_(un)pack is the simple version with
no error output, json_(un)pack_ex has error output and flags,
json_v(un)pack_ex is a va_list version of json_(un)pack_ex.
Implement unpacking flags:
- JSON_UNPACK_ONLY turns off extra validation, i.e. array and object
unpacking doesn't check that all items are unpacked. This is really
just convenience for not adding '*' after each object and array.
- JSON_VALIDATE_ONLY turns off unpacking, i.e. no varargs are expected
and nothing is unpacked.
* By default, json_unpack() now checks that all items of arrays and
objects are unpacked. This is useful for validation.
* Add format specifier '*' to suppress this check for individual
arrays and objects. '*' must appear as the last format specifier
before the closing ']' or '}'.
* Use the format string scanner from the previous commit (with the
simpler separator skipping semantics)
* Split json_vnunpack() to three separate functions for unpacking
objects, arrays and "simple" values
* Always return 0 on success, 1 on error
This shaves around 100 more lines from the original implementation.
* Implement a "scanner" that reads the format string, maintaining state
* Split json_vnpack() to three separate functions for packing objects,
arrays and simple values. This makes it more clear what is being
packed, and the object and array structures become more evident.
* Make the skipping of ignored characters simpler, i.e. skip ':' and
',' independent of their context
This patch shaves around 80 lines of code from the original
implementation.
Note that we pass va_list pointers around instead of just va_lists, which
would seem more intuitive. This is necessary since the behaviour of va_lists
passed as function parameters is finicky. Quoth stdarg(3):
If ap is passed to a function that uses va_arg(ap,type) then the value
of ap is undefined after the return of that function.
The pointer-passing strategy is used by Python's Py_BuildValue() for the same
purpose.
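To illustrate the pointer-passing strategy, here is a small sketch with
a made-up mini packer (demo_pack and its helpers are hypothetical, and
its 'i' and 's' format characters are chosen only for this example).
Each helper consumes its own argument through the shared pointer, so the
caller never relies on the value of a va_list that another function has
advanced.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Each helper pulls its own argument through the shared pointer. */
static void append_int(char *out, size_t size, va_list *ap)
{
    size_t len = strlen(out);

    snprintf(out + len, size - len, "%d", va_arg(*ap, int));
}

static void append_str(char *out, size_t size, va_list *ap)
{
    size_t len = strlen(out);

    snprintf(out + len, size - len, "\"%s\"", va_arg(*ap, const char *));
}

/* Hypothetical mini packer: 'i' takes an int, 's' takes a string, any
   other format character is copied through as-is. `size` must be at
   least 1. */
static void demo_pack(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;

    out[0] = '\0';
    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt == 'i') {
            append_int(out, size, &ap);
        } else if (*fmt == 's') {
            append_str(out, size, &ap);
        } else {
            size_t len = strlen(out);
            if (len + 1 < size) {
                out[len] = *fmt;
                out[len + 1] = '\0';
            }
        }
    }
    va_end(ap);
}
```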
This patch adds two new fields to the json_error_t struct: column and
source. It also adds functions to populate json_error_t internally.
The column field is not currently used, but it will be utilized in the
decoder and pack/unpack functions.
After looking at the new code for a few days, I didn't like it
anymore. To prepare for the future, a few fields will be added to the
json_error_t struct later.
This reverts commit 23dd078c8d. Some
adjustments were needed because of newer commits.
All decoding functions now accept a json_error_t** parameter and set
it to point to a heap-allocated json_error_t structure if an error
occurs. The contents of json_error_t are no longer exposed directly;
a few accessor functions have been added instead. If an error occurs,
the user must free the json_error_t value.
This makes it possible to enhance the error reporting facilities in
the future without breaking ABI compatibility with older versions.
This is a backwards incompatible change.
As of now, the parameter is unused, but may be needed in the future.
I'm adding it now so that in the future both API and ABI remain
backwards compatible as long as possible.
This is a backwards incompatible change.
This is to free up bits from the flags parameter of json_dump
functions. I'm pretty sure no-one needs 256 spaces of indentation when
pretty-printing JSON values...
This is a backwards incompatible change.
json_int_t is typedef'd to long long if it's supported, or long
otherwise. There are also some supporting additions, such as the
JSON_INTEGER_FORMAT macro that expands to the printf() conversion
specifier that corresponds to json_int_t's actual type.
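A rough sketch of the idea, with hypothetical stand-in names
(demo_int_t, DEMO_INTEGER_FORMAT). It assumes the macro holds the
conversion without the leading '%', and it uses LLONG_MAX as a crude
substitute for a real configure-time check.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical stand-ins for json_int_t and JSON_INTEGER_FORMAT. The
   LLONG_MAX test only approximates a real configure check. */
#ifdef LLONG_MAX
typedef long long demo_int_t;
#define DEMO_INTEGER_FORMAT "lld"
#else
typedef long demo_int_t;
#define DEMO_INTEGER_FORMAT "ld"
#endif

/* Format a value without hard-coding the underlying type. */
static int demo_format_int(char *out, size_t size, demo_int_t value)
{
    return snprintf(out, size, "%" DEMO_INTEGER_FORMAT, value);
}
```

Callers that print integers with a hard-coded "%d" or "%ld" are the ones
affected by the type change; going through the macro keeps them
independent of the typedef.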
This is a backwards incompatible change.
Replace all occurrences of unsigned int and unsigned long with size_t.
This is a backwards incompatible change, as the signature of many API
functions changes.
When encoding an array or object ends in an error, the visited flag
wasn't zeroed, causing subsequent encoding attempts to fail. This
patch fixes the problem by always zeroing the visited flag.
Encoding an empty array or object worked, but encoding it again
(possibly after adding some items) failed, because the visited flag
(used for detecting circular references) wasn't zeroed.
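The visited-flag handling can be sketched like this. The node type and
demo_encode() are hypothetical and far simpler than the real encoder;
the point is only that the flag is cleared on every exit path.

```c
#include <stddef.h>

/* Hypothetical container node; the real types differ. */
typedef struct demo_node {
    struct demo_node *children[4];
    size_t child_count;
    int fail;      /* set to simulate an encoding error in this node */
    int visited;   /* nonzero while this node is being encoded */
} demo_node_t;

/* Return 0 on success, -1 on error or circular reference. */
static int demo_encode(demo_node_t *node)
{
    size_t i;
    int result = 0;

    if (node->visited)
        return -1;              /* already on the encoding path: a cycle */
    node->visited = 1;

    if (node->fail)
        result = -1;
    for (i = 0; result == 0 && i < node->child_count; i++)
        result = demo_encode(node->children[i]);

    node->visited = 0;          /* cleared on success, on error, and when empty */
    return result;
}
```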
Initialize their reference counts to (unsigned int)-1 to disable
reference counting on them. It was already meant to work like this,
but the reference counts were just initialized to 1 instead of -1.
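A small sketch of how such a sentinel might work. The names are
hypothetical, and a `destroyed` flag stands in for actually freeing the
value.

```c
/* Hypothetical sentinel that disables reference counting. */
#define DEMO_REFCOUNT_STATIC ((unsigned int)-1)

typedef struct {
    unsigned int refcount;
    int destroyed;           /* set by demo_decref() in place of free() */
} demo_value_t;

static demo_value_t *demo_incref(demo_value_t *v)
{
    if (v && v->refcount != DEMO_REFCOUNT_STATIC)
        v->refcount++;
    return v;
}

static void demo_decref(demo_value_t *v)
{
    if (v && v->refcount != DEMO_REFCOUNT_STATIC && --v->refcount == 0)
        v->destroyed = 1;
}
```

With the count initialized to 1 instead, a single extra decref would be
enough to destroy a value that is supposed to live forever.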
Thanks to Andrew Thompson for reporting this issue.
With this encoding flag, the object key-value pairs in output are in
the same order in which they were first inserted into the object.
To make this possible, a key of an object is now a serial number plus
a string. An object keeps an increasing counter which is used to
assign serial number to the keys. Hashing, comparison and public API
functions were changed to act only on the string part, i.e. the serial
number is ignored everywhere else but in the encoder, where it's used
to order object keys if JSON_PRESERVE_ORDER flag is used.
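The encoder-side ordering could look roughly like the sketch below. The
demo_key_t type and function names are hypothetical; the sketch assumes
the encoder collects the keys into an array and sorts them by serial
number before writing them out.

```c
#include <stdlib.h>

/* Hypothetical key: a serial number plus the string part. */
typedef struct {
    size_t serial;      /* assigned from the object's counter on first insert */
    const char *key;
} demo_key_t;

/* Compare by serial number only; the string part is ignored here. */
static int demo_cmp_serial(const void *a, const void *b)
{
    const demo_key_t *ka = a, *kb = b;

    return (ka->serial > kb->serial) - (ka->serial < kb->serial);
}

static void demo_sort_insertion_order(demo_key_t *keys, size_t count)
{
    qsort(keys, count, sizeof(*keys), demo_cmp_serial);
}
```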
When the property already exists in the object, we can store an
iterator pointing to that property, instead of duplicating the key.
When the property (key) is not present in the object, we still have to
duplicate the key.
If a user happens to store an ElementProxy or a PropertyProxy
instance, we need to take a reference to the JSON value they point to.
With PropertyProxy, the key needs to be copied as well.
Added functions are:
* json_string_nocheck()
* json_string_set_nocheck()
* json_object_set_nocheck()
* json_object_set_new_nocheck()
These functions don't check that their string argument is valid UTF-8,
but assume that the user has already performed the check.
This patch changes the sprintf format from "%0.17f" to "%.17g", as the
f format specifier doesn't print the exponent at all. This caused a
loss of precision in all but the simplest cases.
Because the g specifier doesn't print the decimal fraction or exponent
if they're not needed, a ".0" has to be appended by hand in these
cases. Otherwise the value's type changes from real to integer when
decoding again.
Thanks to Philip Grandinetti for reporting this issue.
- Never append newline to output
- By default, add spaces between array and object items for more
readable output
- Introduce the flag JSON_COMPACT to not add the aforementioned spaces
Failing to do this has the effect that the error message is not
returned when the input file cannot be opened (e.g. if it doesn't
exist).
Thanks to Martin Vopatek for reporting.
It's now an error to try to add an object or array to itself. The
encoder checks for circular references and fails with an error status
if one is detected.
Added functions:
json_string_set
json_integer_set
json_real_set
While at it, clarify the documentation and parameter naming of
json_{string,integer,real}_value() a bit.
Some day we will have ANSI C compatibility... This change doesn't make
the API backwards incompatible because uint32_t was only used in flags
to json_dump*() and the flags are meant to be used only by ORing
constants and macro output, and actually currently only JSON_INDENT
can be used.
In stream_get(), EOF never made it into stream->buffer, and because of
this, stream_unget() failed in some situations. This patch makes
stream_get() handle EOF just like any other byte.
As a "side effect", lex_scan_string() now needs to unget the EOF;
otherwise the EOF ends up in the error message on premature end of
input.
All pointer arguments are now tested for NULL. json_string() now also
tests that strdup() succeeds. This is to ensure that no NULL values
end up in data structures.
Also describe the different sources of errors in documentation.
Don't alloca() a whitespace buffer and fill it with spaces in each
call to dump_indent. Instead, use a static whitespace buffer.
As a bonus, this saves the use of poorly portable alloca().
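The static-buffer approach might look something like this. The callback
type and function names are hypothetical, and the buffer length is
arbitrary; longer indents are simply written in several chunks.

```c
#include <stddef.h>

/* A fixed run of spaces; longer runs are written in several chunks. */
static const char demo_whitespace[] = "                                ";

/* Hypothetical dump callback type: return 0 on success, -1 on error. */
typedef int (*demo_dump_func)(const char *data, size_t len, void *ctx);

static int demo_dump_indent(size_t spaces, demo_dump_func dump, void *ctx)
{
    const size_t chunk = sizeof(demo_whitespace) - 1;

    while (spaces > 0) {
        size_t n = spaces < chunk ? spaces : chunk;

        if (dump(demo_whitespace, n, ctx))
            return -1;
        spaces -= n;
    }
    return 0;
}

/* Example callback that just counts the bytes it is given. */
static int demo_count_bytes(const char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
    return 0;
}
```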
Before, only the syntax level (parse_*) was able to set the error
string. This patch fixes the situation so that lexical (lex_*) and
stream (stream_*) levels can report detailed error messages.
Also, instead of 0, EOF is now returned by stream on error.
It's no longer necessary to load the whole input into a string and then
parse from the string. Instead, the input is read as needed from
a string or file.
Before, json_loads checked for '[' or '{' at the beginning. Now
there's a dedicated function for that: parse_json(). Also rename
parse() to parse_value().
Inside strings, all UTF-8 characters except for \, " and Unicode
control codes are dumped as-is. The control codes that have a special
one-character escape use that escape, and other control codes are
dumped using the \uXXXX escape.
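The escaping rules above can be sketched as follows. demo_dump_string()
is a hypothetical helper writing into a fixed-size buffer, and the
sketch assumes that "control codes" means bytes below 0x20; everything
else, including UTF-8 multibyte sequences, is copied through unchanged.

```c
#include <stdio.h>
#include <string.h>

/* Write a JSON string literal for the NUL-terminated `in` into `out`.
   Returns 0 on success, -1 if `out` is too small. */
static int demo_dump_string(char *out, size_t size, const char *in)
{
    size_t pos = 0;
    const unsigned char *p;

    if (size < 3)
        return -1;
    out[pos++] = '"';
    for (p = (const unsigned char *)in; *p; p++) {
        char esc[7];
        const char *text = esc;
        size_t len;

        switch (*p) {
            case '\\': text = "\\\\"; break;
            case '"':  text = "\\\""; break;
            case '\b': text = "\\b"; break;
            case '\f': text = "\\f"; break;
            case '\n': text = "\\n"; break;
            case '\r': text = "\\r"; break;
            case '\t': text = "\\t"; break;
            default:
                if (*p < 0x20) {
                    /* control code without a one-character escape */
                    snprintf(esc, sizeof esc, "\\u%04x", (unsigned)*p);
                } else {
                    esc[0] = (char)*p;   /* dumped as-is */
                    esc[1] = '\0';
                }
        }
        len = strlen(text);
        if (pos + len + 2 > size)    /* keep room for closing quote and NUL */
            return -1;
        memcpy(out + pos, text, len);
        pos += len;
    }
    out[pos++] = '"';
    out[pos] = '\0';
    return 0;
}
```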
Nothing was appended to strbuffer, so the buffer was left empty. An
empty strbuffer is not an empty string but NULL, so the result was a
segfault.
This patch fixes the problem by initializing strbuffer to an empty
string.
String buffer (strbuffer) is an object that resizes automatically when
data is added to it. It was implemented by generalizing the technique
used in json_dumps().
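As a rough illustration of such a buffer, here is a sketch with
hypothetical names and a simple doubling policy (the real
implementation may differ). Note that init leaves the buffer holding an
empty string rather than NULL, so the value is usable even if nothing
is ever appended.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer; names are illustrative only. */
typedef struct {
    char *value;     /* always NUL-terminated after a successful init */
    size_t length;   /* bytes used, excluding the NUL */
    size_t size;     /* bytes allocated */
} demo_strbuffer_t;

static int demo_strbuffer_init(demo_strbuffer_t *buf)
{
    buf->size = 16;
    buf->length = 0;
    buf->value = malloc(buf->size);
    if (!buf->value)
        return -1;
    buf->value[0] = '\0';    /* empty string, not NULL */
    return 0;
}

static int demo_strbuffer_append(demo_strbuffer_t *buf, const char *data,
                                 size_t len)
{
    if (buf->length + len + 1 > buf->size) {
        size_t new_size = buf->size * 2;
        char *new_value;

        if (new_size < buf->length + len + 1)
            new_size = buf->length + len + 1;
        new_value = realloc(buf->value, new_size);
        if (!new_value)
            return -1;       /* the old buffer is still valid */
        buf->value = new_value;
        buf->size = new_size;
    }
    memcpy(buf->value + buf->length, data, len);
    buf->length += len;
    buf->value[buf->length] = '\0';
    return 0;
}

static void demo_strbuffer_close(demo_strbuffer_t *buf)
{
    free(buf->value);
    buf->value = NULL;
    buf->length = buf->size = 0;
}
```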