Autoresearch optimization run for validation pipeline #5578
Open
swalkinshaw wants to merge 54 commits into master
…ult: {"status":"keep","total_µs":3118621.6,"introspection_us":481,"abstract_frags_us":242.3,"abstract_frags2_us":377.3,"big_query_us":3823.8,"fields_merge_us":3113697.2}
…up (3.12s → 21.7ms)\n\nResult: {"status":"keep","total_µs":21740.5,"introspection_us":466.3,"abstract_frags_us":262.8,"abstract_frags2_us":533.5,"big_query_us":4172.3,"fields_merge_us":16305.6}
…ared_fragments_key\n\nResult: {"status":"keep","total_µs":20607.9,"introspection_us":437.1,"abstract_frags_us":234.4,"abstract_frags2_us":511.1,"big_query_us":3747.1,"fields_merge_us":15678.2}
…s identical\n\nResult: {"status":"keep","total_µs":19744.9,"introspection_us":463.3,"abstract_frags_us":405.5,"abstract_frags2_us":345,"big_query_us":3816.1,"fields_merge_us":14715}
…4.5ms total\n\nResult: {"status":"keep","total_µs":4465.4,"introspection_us":389.7,"abstract_frags_us":335.8,"abstract_frags2_us":304.4,"big_query_us":3435.5,"fields_merge_us":8754.7}
…ate arguments call in RequiredArgumentsArePresent\n\nResult: {"status":"keep","total_µs":4293.8,"introspection_us":505.3,"abstract_frags_us":221,"abstract_frags2_us":317.2,"big_query_us":3250.3,"fields_merge_us":7996.7}
…ursive fragment cross-comparison. big_query -23%\n\nResult: {"status":"keep","total_µs":3486.8,"introspection_us":399.1,"abstract_frags_us":259.2,"abstract_frags2_us":334.2,"big_query_us":2494.3,"fields_merge_us":7362.7}
… in collect_fields_inner\n\nResult: {"status":"keep","total_µs":3356.8,"introspection_us":388,"abstract_frags_us":208.3,"abstract_frags2_us":332.4,"big_query_us":2428.1,"fields_merge_us":7369.3}
…: {"status":"keep","total_µs":2714.4,"introspection_us":747.4,"abstract_frags_us":187.3,"abstract_frags2_us":206,"big_query_us":1573.7,"fields_merge_us":3246.7}
…lds — big_query ~1.47ms\n\nResult: {"status":"keep","total_µs":2576.7,"introspection_us":714.7,"abstract_frags_us":186.5,"abstract_frags2_us":204.1,"big_query_us":1471.4,"fields_merge_us":2762}
…ry metric now includes large_query.\n\nResult: {"status":"keep","total_µs":21911.8,"abstract_frags_us":174.5,"abstract_frags2_us":171.7,"big_query_us":1665.1,"large_query_us":19900.5,"fields_merge_us":2331.7}
…rn_type_conflicts(false). large_query 15.4ms\n\nResult: {"status":"keep","total_µs":17529,"abstract_frags_us":233.4,"abstract_frags2_us":189.3,"big_query_us":1719.4,"large_query_us":15386.9,"fields_merge_us":2508.3}
…ema.did_you_mean is nil. large_query 9.6ms (was 15.4ms, -37%)\n\nResult: {"status":"keep","total_µs":11958.7,"abstract_frags_us":369.6,"abstract_frags2_us":232.6,"big_query_us":1717.4,"large_query_us":9639.1,"fields_merge_us":2387}
…large_query 8.95ms. Verified no regression on smaller benchmarks.\n\nResult: {"status":"keep","total_µs":11271.1,"abstract_frags_us":326.8,"abstract_frags2_us":236.1,"big_query_us":1755.1,"large_query_us":8953.1,"fields_merge_us":2373}
…sets — avoids re-expanding same sub-selections. large_query ~6.2ms (was ~9ms)\n\nResult: {"status":"keep","total_µs":8496.2,"abstract_frags_us":162.9,"abstract_frags2_us":278.3,"big_query_us":1889.2,"large_query_us":6165.8,"fields_merge_us":2448.3}
…eAppropriateSelections, lazy path in FragmentSpreadsArePossible with intersect?, skip allocations in RequiredArgumentsArePresent and ArgumentNamesAreUnique\n\nResult: {"status":"keep","total_µs":8285.2,"abstract_frags_us":169.3,"abstract_frags2_us":260.7,"big_query_us":1875.3,"large_query_us":5979.9,"fields_merge_us":2262}
…, non-leaf type with selections). Avoids kind/leaf checks on hot path.\n\nResult: {"status":"keep","total_µs":8733.3,"abstract_frags_us":187.8,"abstract_frags2_us":259.7,"big_query_us":1956.9,"large_query_us":6328.9,"fields_merge_us":2309.3}
…FragmentSpreadsArePossible and FragmentsAreOnCompositeTypes. Eliminates redundant types.type() lookups.\n\nResult: {"status":"keep","total_µs":8312.4,"abstract_frags_us":334.6,"abstract_frags2_us":194.7,"big_query_us":1523.9,"large_query_us":6259.2,"fields_merge_us":2155.3}
…es.last instead of @types.type(). Cache max_errors in FieldsWillMerge to avoid delegation.\n\nResult: {"status":"keep","total_µs":8493.1,"abstract_frags_us":310.6,"abstract_frags2_us":200.8,"big_query_us":1598,"large_query_us":6383.7,"fields_merge_us":2287.3}
…e object). Avoids redundant type comparison for same-definition field pairs.\n\nResult: {"status":"keep","total_µs":8542.6,"abstract_frags_us":221.2,"abstract_frags2_us":503.8,"big_query_us":1442.1,"large_query_us":6375.5,"fields_merge_us":2517.7}
…repeated Schema::Field#type and Wrapper#unwrap calls in find_conflict and find_conflicts_between_sub_selection_sets\n\nResult: {"status":"keep","total_µs":7934.6,"abstract_frags_us":170.6,"abstract_frags2_us":471.4,"big_query_us":1375.1,"large_query_us":5917.5,"fields_merge_us":2295}
…es only static validation. large_query ~4ms (was ~6ms with Profile creation overhead)\n\nResult: {"status":"keep","total_µs":5189.3,"abstract_frags_us":240.4,"abstract_frags2_us":100.4,"big_query_us":908.5,"large_query_us":3940,"fields_merge_us":2249.2}
…tes repeated kind checks, parent lookups, and visibility checks on warm cache. Universal win across all workloads.\n\nResult: {"status":"keep","total_µs":4651.5,"abstract_frags_us":237.7,"abstract_frags2_us":96.5,"big_query_us":848.5,"large_query_us":3468.8,"fields_merge_us":1852.6}
…e + visibility + referenced? checks. All workloads improve.\n\nResult: {"status":"keep","total_µs":4152.9,"abstract_frags_us":236.9,"abstract_frags2_us":86,"big_query_us":815.9,"large_query_us":3014.1,"fields_merge_us":1844}
…so this benefits all queries permanently. Eliminates recursive unwrap on repeated field visits.\n\nResult: {"status":"keep","total_µs":4090.7,"abstract_frags_us":246.8,"abstract_frags2_us":88,"big_query_us":794.6,"large_query_us":2961.3,"fields_merge_us":1823.6}
…lections.empty? first (cheaper array check) before kind.leaf? dispatch\n\nResult: {"status":"keep","total_µs":4057.5,"abstract_frags_us":238.9,"abstract_frags2_us":88.3,"big_query_us":813.8,"large_query_us":2916.5,"fields_merge_us":1849.6}
…rginal improvement, reduces block allocation in hottest loop\n\nResult: {"status":"keep","total_µs":4074.7,"abstract_frags_us":252.6,"abstract_frags2_us":86.6,"big_query_us":784.9,"large_query_us":2950.6,"fields_merge_us":1889}
…th_type — removes method indirection, simplifies code\n\nResult: {"status":"keep","total_µs":4068.7,"abstract_frags_us":244.5,"abstract_frags2_us":86,"big_query_us":802,"large_query_us":2936.2,"fields_merge_us":1900}
…string allocations per validation for queries with many inline fragments\n\nResult: {"status":"keep","total_µs":4056.3,"abstract_frags_us":241.4,"abstract_frags2_us":86.9,"big_query_us":803.8,"large_query_us":2924.2,"fields_merge_us":1863.8}
…emoization, eliminates repeated string building\n\nResult: {"status":"keep","total_µs":4231.9,"abstract_frags_us":243.1,"abstract_frags2_us":87.6,"big_query_us":797.6,"large_query_us":3103.6,"fields_merge_us":1889}
…991→2641µs (-56%), big_query 1477→780µs (-47%), fields_merge 1.8s→1.8ms (-99.9%). Higher iteration counts + 3-trial median for accuracy.\n\nResult: {"status":"keep","total_µs":3561.9,"abstract_frags_us":62,"abstract_frags2_us":78.5,"big_query_us":780,"large_query_us":2641.4,"fields_merge_us":1780.9}
…s with empty? check before iterating. Avoids ~6000 empty .each calls per validation in large_query.\n\nResult: {"status":"keep","total_µs":3967.1,"abstract_frags_us":230.1,"abstract_frags2_us":85.8,"big_query_us":803.4,"large_query_us":2847.8,"fields_merge_us":1775.2}
…ids repeated type+unwrap dispatch chain. Bigger win for resolver-class schemas where Field#type isn't memoized.\n\nResult: {"status":"keep","total_µs":3997.1,"abstract_frags_us":248.3,"abstract_frags2_us":85,"big_query_us":794.3,"large_query_us":2869.5,"fields_merge_us":1631.2}
… ~1.7x faster than Struct.new with YJIT. Saves ~2600 struct allocations per validation. fields_merge improves most (5000 Field objects).\n\nResult: {"status":"keep","total_µs":3959.8,"abstract_frags_us":255.4,"abstract_frags2_us":84.9,"big_query_us":789.7,"large_query_us":2829.8,"fields_merge_us":1451}
…xt.types/context.query.types in FieldsWillMerge, RequiredArgumentsArePresent, VariablesAreInputTypes. Cache context.fragments in @fragments.\n\nResult: {"status":"keep","total_µs":4191,"abstract_frags_us":258.8,"abstract_frags2_us":88.3,"big_query_us":843.4,"large_query_us":3000.5,"fields_merge_us":1499}
…tsArePresent — avoids re-iterating arguments for the same definition across field instances\n\nResult: {"status":"keep","total_µs":4154.6,"abstract_frags_us":238.2,"abstract_frags2_us":84.6,"big_query_us":822.3,"large_query_us":3009.5,"fields_merge_us":1460.6}
…ble — eliminates push/pop per field visit (2247+ per validation). Uses simple variable assignment instead of array operations.\n\nResult: {"status":"keep","total_µs":4059.6,"abstract_frags_us":257.3,"abstract_frags2_us":90.4,"big_query_us":816.3,"large_query_us":2895.6,"fields_merge_us":1395.2}
…on variable — same pattern as field_definitions, eliminates push/pop for directives\n\nResult: {"status":"keep","total_µs":3906.5,"abstract_frags_us":237.3,"abstract_frags2_us":84,"big_query_us":785.3,"large_query_us":2799.9,"fields_merge_us":1374.6}
…ep","total_µs":3990.4,"abstract_frags_us":247.4,"abstract_frags2_us":85,"big_query_us":801.5,"large_query_us":2856.5,"fields_merge_us":1395.8}
…finition variables — uses Ruby call stack for save/restore instead of Array push/pop. Handles arbitrary nesting depth correctly. Matters for argument-heavy queries.\n\nResult: {"status":"keep","total_µs":3898,"abstract_frags_us":246,"abstract_frags2_us":83.8,"big_query_us":798,"large_query_us":2770.2,"fields_merge_us":1400}
…type variables — eliminates ~5500 push/pop operations per validation across on_field, on_inline_fragment, on_operation_definition, on_fragment_definition\n\nResult: {"status":"keep","total_µs":3919.5,"abstract_frags_us":247.6,"abstract_frags2_us":83.4,"big_query_us":803.4,"large_query_us":2785.1,"fields_merge_us":1329.2}
… — eliminates 379 block closure allocations per validation\n\nResult: {"status":"keep","total_µs":3929.8,"abstract_frags_us":248.9,"abstract_frags2_us":84.1,"big_query_us":787.8,"large_query_us":2809,"fields_merge_us":1352.8}
…s are all unaliased, unique-named Fields with no fragments — avoids collect_fields+find_conflicts_within for ~55% of non-leaf fields\n\nResult: {"status":"keep","total_µs":3700.4,"abstract_frags_us":246.4,"abstract_frags2_us":84.8,"big_query_us":752.8,"large_query_us":2616.4,"fields_merge_us":1497}
…iredArgumentsArePresent — avoids one method dispatch per field\n\nResult: {"status":"keep","total_µs":3806.2,"abstract_frags_us":242.9,"abstract_frags2_us":86.5,"big_query_us":772.9,"large_query_us":2703.9,"fields_merge_us":1483.6}
…field and on_operation_definition\n\nResult: {"status":"keep","total_µs":3886.7,"abstract_frags_us":259.6,"abstract_frags2_us":86.3,"big_query_us":785.1,"large_query_us":2755.7,"fields_merge_us":1506.8}
…ated Array.new(32) with @path_depth tracking eliminates Array#push/#pop overhead for ~5500 path operations per validation\n\nResult: {"status":"keep","total_µs":3741.9,"abstract_frags_us":257.6,"abstract_frags2_us":82.4,"big_query_us":746.8,"large_query_us":2655.1,"fields_merge_us":1322}
…definition — eliminates yield/block overhead and method dispatch for 531 fragment callbacks per validation\n\nResult: {"status":"keep","total_µs":3740.1,"abstract_frags_us":248.1,"abstract_frags2_us":85,"big_query_us":751.2,"large_query_us":2655.8,"fields_merge_us":1318}
…leaf+no-selections ~70%, non-leaf+selections ~16%). Previous fast path only caught 16%. Now covers 86% of fields.\n\nResult: {"status":"keep","total_µs":3719.9,"abstract_frags_us":239,"abstract_frags2_us":85.8,"big_query_us":751,"large_query_us":2644.1,"fields_merge_us":1349.2}
…ectly in response_keys, only wrap in array on collision. Saves ~1200 array allocations (85% of response keys are single-field). large_query -15%.\n\nResult: {"status":"keep","total_µs":3304.6,"abstract_frags_us":236.2,"abstract_frags2_us":82.8,"big_query_us":747.9,"large_query_us":2237.7,"fields_merge_us":2689.4}
…te on first fragment spread encounter. Saves ~200 hash allocations for fragment-free selection sets.\n\nResult: {"status":"keep","total_µs":3140.2,"abstract_frags_us":228.7,"abstract_frags2_us":77.8,"big_query_us":690.1,"large_query_us":2143.6,"fields_merge_us":2871}
Owner
Thanks for sharing what you found here! I definitely wouldn't say no to optimizations in this department. (In the bigger picture, I've lately been wondering, what if we got rid of all the different modules and did validation without so many …)
I don't expect to find time to review these closely until I've got the new execution module out the door (in 2.6.0). After that I'll come review more closely. If there are any of these that you find particularly awesome (FieldsWillMerge rewrite?), please feel free to dress them up for independent review and merge.
This PR isn't designed to be mergeable as is. It's an experiment running Karpathy's autoresearch process on the static validation pipeline. I used the pi extension.
@rmosolgo feel free to cherry-pick any of the optimizations below if you're interested in them.
Couple things of note:
- `./test_fast.sh` script for a quicker feedback loop running tests.
Optimize static validation pipeline
~65% faster static validation on realistic workloads, measured with YJIT and Visibility Profiles enabled.
Benchmark results (YJIT, Visibility Profiles, `did_you_mean(nil)`)
Benchmarks measure only `Validator#validate`; Query/Profile initialization is excluded.
What changed (20 files, +534/-383)
FieldsWillMerge rewrite
The biggest win. The original algorithm compared fields across three phases (fields-within, fields-vs-fragments, fragments-vs-fragments) with recursive fragment cross-comparison that could be exponential. This rewrites it to:
Visibility Profile caching
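The per-key caching this section describes can be sketched roughly as follows. This is a toy model, not graphql-ruby's actual Profile class: `CachingProfile`, `load_field`, and the nested-hash layout are all hypothetical names chosen for illustration, and the real lookup involves kind, parent, and visibility checks that are collapsed into a single `dig` here.

```ruby
# Hypothetical stand-in for a visibility profile; not graphql-ruby's real class.
class CachingProfile
  attr_reader :lookups

  def initialize(schema_fields)
    @schema_fields = schema_fields # { owner_name => { field_name => definition } }
    @lookups = 0
    # Nested plain hashes (owner, then field name) avoid allocating a
    # composite [owner, field_name] array key on every call.
    @cached_fields = {}
  end

  def field(owner, field_name)
    per_owner = (@cached_fields[owner] ||= {})
    # key? is used instead of ||= so that a nil result (missing or hidden
    # field) is cached as well and never looked up twice.
    return per_owner[field_name] if per_owner.key?(field_name)

    per_owner[field_name] = load_field(owner, field_name)
  end

  private

  # Stands in for the expensive path (kind checks, parent lookups, visibility).
  def load_field(owner, field_name)
    @lookups += 1
    @schema_fields.dig(owner, field_name)
  end
end

profile = CachingProfile.new("Query" => { "user" => :user_field })
3.times { profile.field("Query", "user") }
2.times { profile.field("Query", "missing") }
puts profile.lookups # prints 2: one real lookup per distinct (owner, field_name) pair
```

The sketch assumes the cache lives exactly as long as the profile it belongs to, so it never needs invalidation.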
- Cache `Profile#field(owner, field_name)` results per (owner, field_name) pair: eliminates repeated kind checks, parent lookups, and visibility checks
- Cache `Profile#type(type_name)` results: eliminates repeated `get_type` + visibility + `referenced?` checks
Visitor allocation reduction
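The save/restore idea in this section can be illustrated with a toy visitor. Everything here is hypothetical (`ToyNode`, `StackVisitor`, `SaveRestoreVisitor`), and the real visitor tracks several kinds of definitions at once. The point is only that a local variable held on the Ruby call stack can stand in for an explicit Array used as a stack.

```ruby
# Hypothetical miniature AST node; the real visitor walks GraphQL language nodes.
ToyNode = Struct.new(:name, :children)

# Baseline: an Array used as a stack, with a push and a pop per visited node.
class StackVisitor
  attr_reader :seen

  def initialize
    @field_definitions = []
    @seen = []
  end

  def visit(node)
    @field_definitions.push(node.name)
    @seen << [node.name, @field_definitions[-2]] # [current, parent]
    node.children.each { |child| visit(child) }
    @field_definitions.pop
  end
end

# Save/restore: the previous value lives in a local, i.e. on the Ruby call stack.
class SaveRestoreVisitor
  attr_reader :seen

  def initialize
    @current_field = nil
    @seen = []
  end

  def visit(node)
    previous = @current_field   # save
    @current_field = node.name
    @seen << [node.name, previous]
    node.children.each { |child| visit(child) }
    @current_field = previous   # restore on the way back up
  end
end

tree = ToyNode.new("a", [ToyNode.new("b", [ToyNode.new("c", [])]), ToyNode.new("d", [])])
stack_visitor = StackVisitor.new
save_restore_visitor = SaveRestoreVisitor.new
stack_visitor.visit(tree)
save_restore_visitor.visit(tree)
puts stack_visitor.seen == save_restore_visitor.seen # prints true
```

Because each recursive call keeps its own `previous`, nesting depth is handled without any extra bookkeeping. The sketch does not use `ensure`, so an exception raised mid-visit would leave `@current_field` stale.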
- Replace stack arrays (`@field_definitions`, `@directive_definitions`, `@argument_definitions`, `@object_types`) with save/restore instance variables: eliminates ~12,000 Array `push`/`pop` operations per validation
- Replace `@path` push/pop with a pre-allocated indexed array + depth counter
- Inline `on_fragment_with_type` into `on_inline_fragment` and `on_fragment_definition`: eliminates block/yield overhead
- Inline the `setting_errors` block into `on_field` and `on_operation_definition`
Type system memoization
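As a rough sketch of the memoization in this section, the classes below imitate NonNull/List wrappers around a named type. They are hypothetical stand-ins, not graphql-ruby's classes, and they assume wrapper objects are never mutated or frozen after the schema is built, which is what makes caching on the object itself safe.

```ruby
# Hypothetical leaf type; unwrapping a named type returns itself.
class NamedType
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def unwrap
    self
  end

  def to_type_signature
    name
  end
end

# Hypothetical base for NonNull/List-style wrappers.
class Wrapper
  attr_reader :of_type

  def initialize(of_type)
    @of_type = of_type
  end

  def unwrap
    # The innermost type is computed once per wrapper object and then reused,
    # so repeated visits skip the recursive walk.
    @unwrapped ||= @of_type.unwrap
  end
end

class NonNull < Wrapper
  def to_type_signature
    @type_signature ||= "#{@of_type.to_type_signature}!"
  end
end

class List < Wrapper
  def to_type_signature
    @type_signature ||= "[#{@of_type.to_type_signature}]"
  end
end

type = NonNull.new(List.new(NonNull.new(NamedType.new("String"))))
puts type.to_type_signature # prints [String!]!
puts type.unwrap.name       # prints String
```

After the first call, both methods return the stored object, so no new string is built and no recursion happens on later calls.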
- Memoize `Wrapper#unwrap` on NonNull/List wrapper objects (schema-level, permanent)
- Memoize `to_type_signature` on NonNull/List wrappers
- Cache `field_definition.type.unwrap` per Schema::Field in the visitor
Rule-level micro-optimizations
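Two of the micro-optimizations in this section are small enough to show in isolation. The snippet below is only an illustration with made-up data (`arguments`, `count_directives`); it is not code from the PR. It contrasts a default-proc Hash with a plain Hash plus `||=`, and shows an `empty?` guard in front of an iteration.

```ruby
# Default-proc version: every miss invokes the block to build and store the array.
by_name_with_proc = Hash.new { |hash, key| hash[key] = [] }

# Plain-hash version: the miss path is an explicit ||=, with no block call.
by_name_plain = {}

arguments = %w[id first after first] # hypothetical argument names on one field

arguments.each do |name|
  by_name_with_proc[name] << name
  (by_name_plain[name] ||= []) << name
end

puts by_name_plain == by_name_with_proc # prints true: same keys and values

# Guarding with empty? returns before .each is called, so the common
# "no directives" case never sets up the iteration or its block.
def count_directives(directives)
  return 0 if directives.empty?

  count = 0
  directives.each { |_directive| count += 1 }
  count
end

puts count_directives([])                 # prints 0
puts count_directives(%w[skip include])   # prints 2
```

A side benefit of the plain hash is that merely reading a missing key does not insert an entry, which a default proc that assigns would do.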
- Skip `all_types` loading when `schema.did_you_mean` is nil
- Avoid `Hash.new` with default proc; use plain hash
- Use cached instance variables (`@current_field_definition`, `@current_object_type`, `@types`, `@fragments`) instead of accessor methods/delegation in hot paths
- Skip empty `.each` iteration for arguments, directives, and selections (~6,000 avoided per large validation)
Test changes
3 test expectations in `fields_will_merge_spec.rb` updated for semantically equivalent error ordering (flattened fragment collection reports errors in a different order than the recursive three-phase approach).