
12113 commits

Author SHA1 Message Date
hryx
2933a8241a json: disallow overlong and out-of-range UTF-8
Fixes #2379

= Overlong (non-shortest) sequences

UTF-8's unique encoding scheme allows for some Unicode codepoints
to be represented in multiple ways. For any of these characters,
the spec forbids all but the shortest form. These disallowed longer
sequences are called "overlong". As an interesting side effect of
this rule, the bytes C0 and C1 never appear in valid UTF-8.

= Codepoint range

UTF-8 disallows representation of codepoints beyond U+10FFFF,
which is the highest codepoint that can be encoded in UTF-16.
Because a 4-byte sequence can encode values beyond that limit,
they must be explicitly rejected. This rule also has an interesting
side effect: the bytes F5 to FF never appear in valid UTF-8.

= References

Detecting an overlong version of a codepoint could get gnarly, but
luckily the Unicode Consortium did the hard work by creating this
handy table of valid byte sequences:

https://unicode.org/versions/corrigendum1.html

I thought this mapped nicely to the parser's state machine, so I
rearranged the relevant states to make use of it.
2020-01-07 12:07:44 -05:00
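
The two rules described in commit 2933a8241a can be sketched as a lookup on the lead byte. This is a minimal, language-neutral illustration in Python, not the Zig state machine from the commit. The function names `lead_byte_rule` and `is_valid_utf8` are hypothetical. The byte ranges follow the rules stated in the message: C0, C1, and F5..FF are never valid, and the second byte is restricted after E0, F0, and F4. Surrogate handling is not mentioned in the message and is left out of this sketch.

```python
def lead_byte_rule(lead):
    """Return (sequence_length, second_lo, second_hi) for a lead byte.

    Returns None when the byte can never start a valid sequence:
    stray continuation bytes (80..BF), C0, C1, and F5..FF.
    """
    if lead <= 0x7F:
        return (1, 0, 0)          # ASCII, no continuation bytes
    if 0xC2 <= lead <= 0xDF:
        return (2, 0x80, 0xBF)
    if lead == 0xE0:
        return (3, 0xA0, 0xBF)    # 80..9F would be overlong
    if 0xE1 <= lead <= 0xEF:
        return (3, 0x80, 0xBF)
    if lead == 0xF0:
        return (4, 0x90, 0xBF)    # 80..8F would be overlong
    if 0xF1 <= lead <= 0xF3:
        return (4, 0x80, 0xBF)
    if lead == 0xF4:
        return (4, 0x80, 0x8F)    # 90..BF would exceed U+10FFFF
    return None


def is_valid_utf8(data):
    """Reject overlong and out-of-range sequences in a bytes object."""
    i = 0
    while i < len(data):
        rule = lead_byte_rule(data[i])
        if rule is None:
            return False
        length, second_lo, second_hi = rule
        if i + length > len(data):
            return False          # truncated sequence
        for k in range(1, length):
            byte = data[i + k]
            if k == 1:
                # Only the second byte has a lead-dependent range.
                if not (second_lo <= byte <= second_hi):
                    return False
            elif not (0x80 <= byte <= 0xBF):
                return False
        i += length
    return True
```

With this sketch, `is_valid_utf8(b"\xc0\x80")` (an overlong NUL) and `is_valid_utf8(b"\xf4\x90\x80\x80")` (U+110000) are both rejected, while `b"\xf4\x8f\xbf\xbf"` (U+10FFFF) is accepted.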
Vexu
4184d4c66a
std-c parser record and enum specifiers 2020-01-07 19:05:46 +02:00
Vexu
df12c1328e
std-c parser typing improvements 2020-01-07 16:05:13 +02:00
Timon Kruiper
0deab8fd3b Add std.mem.zeroes to the standard library
This zero initializes the type passed in. Can be used to zero
initialize c structs.
2020-01-06 19:24:17 -05:00
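
The idea behind commit 0deab8fd3b, producing a value of a given type with every byte set to zero, can be illustrated outside Zig. The following is a rough Python analogue using `ctypes`, not the `std.mem.zeroes` implementation. The helper name `zeroes` is borrowed from the commit title, and the `Timeval` layout is a hypothetical example struct.

```python
import ctypes


class Timeval(ctypes.Structure):
    # Hypothetical C struct used only for illustration.
    _fields_ = [("tv_sec", ctypes.c_long), ("tv_usec", ctypes.c_long)]


def zeroes(ctype):
    """Create an instance of a ctypes type with all of its bytes set to zero."""
    value = ctype()
    # ctypes already zero-fills new instances; the explicit memset
    # makes the intent visible and mirrors the C idiom.
    ctypes.memset(ctypes.addressof(value), 0, ctypes.sizeof(value))
    return value
```

Calling `zeroes(Timeval)` returns a struct whose fields are all 0, which is the same effect as `memset(&tv, 0, sizeof tv)` in C.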
LemonBoy
e3a63b4e5a Add more compiler-rt functions for ARM platform 2020-01-06 19:08:15 -05:00
Andrew Kelley
d3d77138ec
remove redundant license file 2020-01-06 19:05:42 -05:00
Andrew Kelley
633b6bf920
Merge branch 'LemonBoy-cc-work' 2020-01-06 18:53:17 -05:00
Andrew Kelley
53913acaf7
zig fmt and update extern fn to callconv(.C) 2020-01-06 15:34:50 -05:00
Andrew Kelley
5951b79af4
remove stdcallcc, extern, nakedcc from stage1; zig fmt rewrites 2020-01-06 15:23:05 -05:00
Andrew Kelley
0a9daeb37e
Merge branch 'cc-work' of https://github.com/LemonBoy/zig into LemonBoy-cc-work 2020-01-06 14:07:56 -05:00
Colin Svingen
4e6ad8efd9 Removes proc_raise from WASI implementation 2020-01-06 14:04:55 -05:00
xackus
6bebf741f9 json: implement copy_strings=false 2020-01-06 19:59:54 +01:00
Vexu
3ed6d7d245
std-c parser declarator 2020-01-06 20:06:17 +02:00
Vexu
d5d52af26e
std-c parse pointer 2020-01-06 00:06:33 +02:00
Andrew Kelley
fee9318b17
std.os.getrusage: add C extern fn and reserved field
 * add reserved field to match musl struct definition so that
   it will work with musl libc.
 * add libc getrusage so that it will work with libc

what's not done in this branch is:
 * test coverage. See #1629, which should also aim to provide
   general test coverage for the std lib.
 * rusage struct bits for non-linux operating systems
2020-01-05 16:57:14 -05:00
data-man
2f6b045fb1
Add std.os.getrusage 2020-01-05 16:52:36 -05:00
Andrew Kelley
a0ca34979e
Merge pull request #4053 from ziglang/test-run-translated-c
add test harness for "run translated C" tests
2020-01-05 14:50:02 -05:00
Vexu
5feeff7123
std-c improve error reporting and decl parsing 2020-01-05 20:25:52 +02:00
Vexu
795a503999
std-c tokenizer always add newline token 2020-01-05 20:25:51 +02:00
Vexu
f934f9b419
std-c parser fndef and static assert 2020-01-05 20:25:51 +02:00
Vexu
46f292982d
std-c parser DeclSpec 2020-01-05 20:25:51 +02:00
Vexu
25f7f66b8f
std-c type parsing 2020-01-05 20:25:51 +02:00
Vexu
dccf1247b2
std-c ifstmt compoundstmt and errors 2020-01-05 20:25:51 +02:00
Vexu
a20c0b31de
std-c parser and ast organization 2020-01-05 20:25:51 +02:00
Vexu
73a53fa263
std-c outline parser 2020-01-05 20:25:50 +02:00
Vexu
e1b01d32f0
std-c ast base 2020-01-05 20:25:50 +02:00
Vexu
2183c4bb44
std-c tokenizer string concatenation 2020-01-05 20:25:50 +02:00
Vexu
a5d1fb1e49
std-c tokenizer line continuation, tests and fixes 2020-01-05 20:25:50 +02:00
Vexu
c221593d7d
std-c tokenizer better special case handling 2020-01-05 20:25:50 +02:00
Vexu
472ca947c9
std-c tokenizer add tests 2020-01-05 20:25:50 +02:00
Vexu
d75697a6a3
std-c tokenizer keywords 2020-01-05 20:25:50 +02:00
Vexu
26bf410b06
std-c finish tokenizer 2020-01-05 20:25:49 +02:00
Vexu
f14a5287e9
std-c tokenizer strings, floats and comments 2020-01-05 20:25:49 +02:00
Vexu
05acc0b0c1
std-c tokenizer more stuff 2020-01-05 20:25:49 +02:00
Vexu
04b7cec42e
std-c tokenizer base 2020-01-05 20:25:49 +02:00
Andrew Kelley
242f5d10d5
fix test-gen-h and test-compile-errors regression 2020-01-05 13:08:18 -05:00
Haze Booth
2e5342512f remove @TypeOf() hacks for comptime_int/comptime_float 2020-01-05 02:33:23 -05:00
Andrew Kelley
a690a5085d
rework and improve some of the zig build steps
 * `RunStep` moved to lib/std/build/run.zig and gains ability to compare
   output and exit code against expected values. Multiple redundant
   locations in the test harness code are replaced to use `RunStep`.
 * `WriteFileStep` moved to lib/std/build/write_file.zig and gains
   ability to write more than one file into the cache directory, for
   when the files need to be relative to each other. This makes
   usage of `WriteFileStep` no longer problematic when parallelizing
   zig build.
 * Added `CheckFileStep`, which can be used to validate that the output
   of another step produced a valid file. Multiple redundant locations
   in the test harness code are replaced to use `CheckFileStep`.
 * Added `TranslateCStep`. This exposes `zig translate-c` to the build
   system, which is likely to be rarely useful to most Zig users;
   however, Zig's own test suite uses it both for translate-c tests and
   for run-translated-c tests.
 * Refactored the ad-hoc code that handles source files coming from
   multiple kinds of sources into `std.build.FileSource`.
 * Added `std.build.Builder.addExecutableFromWriteFileStep`.
 * Added `std.build.Builder.addExecutableSource`.
 * Added `std.build.Builder.addWriteFiles`.
 * Added `std.build.Builder.addTranslateC`.
 * Added `std.build.LibExeObjStep.addCSourceFileSource`.
 * Added `std.build.LibExeObjStep.addAssemblyFileFromWriteFileStep`.
 * Added `std.build.LibExeObjStep.addAssemblyFileSource`.
 * Exposed `std.fs.base64_encoder`.
2020-01-05 02:19:22 -05:00
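
The `RunStep` behavior described in commit a690a5085d, running a command and comparing its output and exit code against expected values, can be sketched generically. This is a Python illustration of the technique only, not the `std.build` API; the function name `run_and_check` and its parameters are hypothetical.

```python
import subprocess
import sys


def run_and_check(argv, expected_stdout, expected_exit_code=0):
    """Run argv and return a list of mismatch descriptions (empty on success)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    problems = []
    if result.returncode != expected_exit_code:
        problems.append(
            "exit code %d, expected %d" % (result.returncode, expected_exit_code)
        )
    if result.stdout != expected_stdout:
        problems.append(
            "stdout %r, expected %r" % (result.stdout, expected_stdout)
        )
    return problems


if __name__ == "__main__":
    # Example: a child process that prints one line and exits with 0.
    issues = run_and_check([sys.executable, "-c", "print('hi')"], "hi\n")
    print(issues or "ok")
```

Returning a list of problems rather than raising on the first mismatch lets a test harness report both an unexpected exit code and unexpected output from a single run.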
Andrew Kelley
6ea193946d
Merge pull request #3950 from nmichaels/master
Document std.Mutex.
2020-01-03 20:05:03 -05:00
LemonBoy
e6485282d3 Better logic for last-param rendering 2020-01-03 11:49:42 +01:00
LemonBoy
7b375a1c4a Revert "Revert "Trailing comma is respected for builtin calls""
This reverts commit f83411b0b1.
2020-01-03 10:17:40 +01:00
Andrew Kelley
f83411b0b1
Revert "Trailing comma is respected for builtin calls"
This reverts commit afd0290918.

This caused test failures.
2020-01-02 21:53:25 -05:00
LemonBoy
afd0290918 Trailing comma is respected for builtin calls 2020-01-02 16:43:39 -05:00
LemonBoy
e99209baf0 Add transform test 2020-01-02 18:57:08 +01:00
LemonBoy
0ccac79c8e Implement Thiscall CC 2020-01-02 18:57:08 +01:00
LemonBoy
0ec64d4c0c Integrate callconv into translate-c-2 2020-01-02 18:53:21 +01:00
LemonBoy
563d9ebfe5 Implement the callconv() annotation 2020-01-02 18:53:16 +01:00
Andrew Kelley
cb56b26900
fix float ops with respect to vectors
also remove the redundant type parameter
2020-01-01 23:27:43 -05:00
Andrew Kelley
576320e6d5
Merge pull request #4025 from ziglang/Vexu-stage-2-cimport
Use self hosted translate-c for cImport
2020-01-01 22:46:46 -05:00
Andrew Kelley
5575e2a168
std.mem.compare: breaking API changes
 * `std.mem.Compare` is now `std.math.Order` and the enum tags
   are renamed to follow the new style convention.
 * `std.mem.compare` is renamed to `std.mem.order`.
 * new function `std.math.order`
2020-01-01 18:08:40 -05:00
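
The split described in commit 5575e2a168, a three-way order type plus one function for scalars and one for sequences, can be sketched as follows. This is a Python illustration of the concept, not the Zig API: the names `Order`, `order`, and `order_seq` and the tag spellings are assumptions made for this example, and lexicographic comparison for the sequence case is likewise an assumption.

```python
from enum import Enum


class Order(Enum):
    # Three-way comparison result; tag names are illustrative.
    LT = -1
    EQ = 0
    GT = 1


def order(a, b):
    """Three-way comparison of two scalar values."""
    if a < b:
        return Order.LT
    if a > b:
        return Order.GT
    return Order.EQ


def order_seq(lhs, rhs):
    """Lexicographic three-way comparison of two sequences."""
    for x, y in zip(lhs, rhs):
        result = order(x, y)
        if result is not Order.EQ:
            return result
    # All shared positions are equal, so the shorter sequence sorts first.
    return order(len(lhs), len(rhs))
```

For example, `order_seq(b"abc", b"abd")` yields `Order.LT`, and `order_seq(b"abc", b"ab")` yields `Order.GT` because the common prefix is equal and the left side is longer.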