|
| 1 | +// |
| 2 | +// The standard library is becoming increasingly important in Zig. |
| 3 | +// On the one hand, it is helpful to look at how the individual |
| 4 | +// functions are implemented, because they make wonderful |
| 5 | +// templates for your own functions. On the other hand, these |
| 6 | +// standard functions are part of Zig's basic equipment. |
| 7 | +// |
| 8 | +// This means that they are always available on every system. |
| 9 | +// It is therefore worthwhile to cover them in Ziglings as well. |
| 10 | +// It's a great way to learn important skills. For example, it is |
| 11 | +// often necessary to process large amounts of data from files. |
| 12 | +// For this kind of sequential reading and processing, Zig provides |
| 13 | +// some useful functions, which we will take a closer look at in the |
| 14 | +// coming exercises. |
| 15 | +// |
| 16 | +// A nice example of this has been published on the Zig homepage, |
| 17 | +// replacing the somewhat dusty 'Hello world!'. |
| 18 | +// |
| 19 | +// Nothing against 'Hello world!', but it just doesn't do justice |
| 20 | +// to the elegance of Zig, and it would be a pity if someone took a |
| 21 | +// short first look at the homepage and didn't get 'enchanted'. The |
| 22 | +// present example is simply better suited for that, so we will |
| 23 | +// use it as an introduction to tokenizing, because it illustrates |
| 24 | +// the basic principles well. |
| 25 | +// |
| 26 | +// In the following exercises we will also read and process data from |
| 27 | +// large files, and by then at the latest it will be clear to everyone |
| 28 | +// how useful all this is. |
| 29 | +// |
| 30 | +// Let's start with the analysis of the example from the Zig homepage |
| 31 | +// and explain the most important things. |
| 32 | +// |
| 33 | +// const std = @import("std"); |
| 34 | +// |
| 35 | +// // Here we create a short alias for a standard library function |
| 36 | +// // that converts a number given as a string into the |
| 37 | +// // corresponding integer value. |
| 38 | +// const parseInt = std.fmt.parseInt; |
| 39 | +// |
| 40 | +// // Defining a test case |
| 41 | +// test "parse integers" { |
| 42 | +// |
| 43 | +// // Four numbers are passed in a string. |
| 44 | +// // Please note that the individual values are separated |
| 45 | +// // either by a space or a comma. |
| 46 | +// const input = "123 67 89,99"; |
| 47 | +// |
| 48 | +// // In order to be able to process the input values, |
| 49 | +// // memory is required. An allocator is defined here for |
| 50 | +// // this purpose. |
| 51 | +// const ally = std.testing.allocator; |
| 52 | +// |
| 53 | +// // The allocator is used to initialize a dynamic array |
| 54 | +// // (an ArrayList) in which the numbers are stored. |
| 55 | +// var list = std.ArrayList(u32).init(ally); |
| 56 | +// |
| 57 | +// // Deferring the cleanup right away means freeing the memory |
| 58 | +// // can't be forgotten; the testing allocator would report a leak. |
| 59 | +// defer list.deinit(); |
| 60 | +// |
| 61 | +// // Now it gets exciting: |
| 62 | +// // A standard tokenizer is called (Zig has several). It returns |
| 63 | +// // an iterator that splits the input at the separators |
| 64 | +// // (remember: space and comma). |
| 65 | +// var it = std.mem.tokenize(u8, input, " ,"); |
| 66 | +// |
| 67 | +// // The iterator can now be consumed in a loop, yielding the |
| 68 | +// // individual numbers one at a time. |
| 69 | +// while (it.next()) |num| { |
| 70 | +// // But be careful: The numbers are still only available |
| 71 | +// // as strings. This is where the integer parser comes |
| 72 | +// // into play, converting them into real integer values. |
| 73 | +// const n = try parseInt(u32, num, 10); |
| 74 | +// |
| 75 | +// // Finally the individual values are stored in the array. |
| 76 | +// try list.append(n); |
| 77 | +// } |
| 78 | +// |
| 79 | +// // For the subsequent test, a second static array is created, |
| 80 | +// // which is directly filled with the expected values. |
| 81 | +// const expected = [_]u32{ 123, 67, 89, 99 }; |
| 82 | +// |
| 83 | +// // Now the numbers converted from the string can be compared |
| 84 | +// // with the expected ones. If every pair matches, the test |
| 85 | +// // passes. |
| 86 | +// for (expected, list.items) |exp, actual| { |
| 87 | +// try std.testing.expectEqual(exp, actual); |
| 88 | +// } |
| 89 | +// } |
| 90 | +// |
| 91 | +// So much for the example from the homepage. |
| 92 | +// Let's summarize the basic steps again: |
| 93 | +// |
| 94 | +// - We have a set of data in sequential order, separated from each other |
| 95 | +// by means of various characters. |
| 96 | +// |
| 97 | +// - For further processing, for example in an array, this data must be |
| 98 | +// read in, separated and, if necessary, converted into the target format. |
| 99 | +// |
| 100 | +// - We need a buffer that is large enough to hold the data. |
| 101 | +// |
| 102 | +// - This buffer can be created either statically at compile time, if the |
| 103 | +// amount of data is already known, or dynamically at runtime by using |
| 104 | +// a memory allocator. |
| 105 | +// |
| 106 | +// - The data is split at the respective separators by a tokenizer |
| 107 | +// and stored in the reserved memory. This usually also |
| 108 | +// includes conversion to the target format. |
| 109 | +// |
| 110 | +// - Now the data can be conveniently processed further in the correct format. |
| 111 | +// |
| 112 | +// These steps are basically always the same. |
| 113 | +// Whether the data is read from a file or entered by the user via the |
| 114 | +// keyboard, for example, is irrelevant. Only the details differ, |
| 115 | +// and that's why Zig has different tokenizers. But more about this in |
| 116 | +// later exercises. |
| 117 | +// |
| 118 | +// Now let's write a small program of our own to tokenize some data; |
| 119 | +// after all, we need some practice. Suppose we want to count the words |
| 120 | +// of this little poem: |
| 121 | +// |
| 122 | +// My name is Ozymandias, King of Kings; |
| 123 | +// Look on my Works, ye Mighty, and despair! |
| 124 | +// by Percy Bysshe Shelley |
| 125 | +// |
| 126 | +// |
| 127 | +const std = @import("std"); |
| 128 | +const print = std.debug.print; |
| 129 | + |
| 130 | +pub fn main() !void { |
| 131 | + |
| 132 | + // our input |
| 133 | + const poem = |
| 134 | + \\My name is Ozymandias, King of Kings; |
| 135 | + \\Look on my Works, ye Mighty, and despair! |
| 136 | + ; |
| 137 | + |
| 138 | + // now the tokenizer, but what do we need here? |
| 139 | + var it = std.mem.tokenize(u8, poem, ???); |
| 140 | + |
| 141 | + // print all words and count them |
| 142 | + var cnt: usize = 0; |
| 143 | + while (it.next()) |word| { |
| 144 | + cnt += 1; |
| 145 | + print("{s}\n", .{word}); |
| 146 | + } |
| 147 | + |
| 148 | + // print the result |
| 149 | + print("This little poem has {d} words!\n", .{cnt}); |
| 150 | +} |