Always use optimal encoding function #512

Merged

ypconstante
Contributor

Floki today has two similar encoding functions, Floki.Entities.encode and Floki.RawHTML.html_escape.
This PR moves the html_escape code to the Entities module, ensuring we always use the fastest encoding function available.
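
To illustrate the kind of duplication described above, here is a minimal sketch of two escaping functions that return the same result at different cost. It is not Floki's code: the module name `EscapeSketch`, both function names, and the set of escaped characters are assumptions made for this example.

```elixir
defmodule EscapeSketch do
  @moduledoc """
  Hypothetical illustration only; this is not taken from Floki.
  """

  # Single-pass variant: walks the binary once and collects iodata,
  # converting to a binary only at the end.
  def escape(text) when is_binary(text) do
    text
    |> do_escape([])
    |> IO.iodata_to_binary()
  end

  defp do_escape("&" <> rest, acc), do: do_escape(rest, [acc, "&amp;"])
  defp do_escape("<" <> rest, acc), do: do_escape(rest, [acc, "&lt;"])
  defp do_escape(">" <> rest, acc), do: do_escape(rest, [acc, "&gt;"])
  defp do_escape("\"" <> rest, acc), do: do_escape(rest, [acc, "&quot;"])
  defp do_escape("'" <> rest, acc), do: do_escape(rest, [acc, "&#39;"])
  # Any other byte is copied through unchanged, so multi-byte UTF-8
  # sequences survive intact.
  defp do_escape(<<byte, rest::binary>>, acc), do: do_escape(rest, [acc, byte])
  defp do_escape(<<>>, acc), do: acc

  # Multi-pass variant: one full scan of the string per character class.
  # "&" must be replaced first so later replacements are not re-escaped.
  def escape_naive(text) when is_binary(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
    |> String.replace("'", "&#39;")
  end
end

IO.puts(EscapeSketch.escape("a < b & c"))
# prints "a &lt; b &amp; c"
```

When two functions like these coexist, callers can end up on the slower one by accident. Keeping a single implementation in one module avoids that.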

##### With input big #####
Name                    ips        average  deviation         median         99th %
bench                 50.48       19.81 ms     ±8.35%       19.23 ms       26.89 ms
bench (today)         25.47       39.26 ms    ±10.21%       38.05 ms       55.82 ms

Comparison: 
bench                 50.48
bench (today)         25.47 - 1.98x slower +19.45 ms

Memory usage statistics:

Name             Memory usage
bench                 9.49 MB
bench (today)         9.23 MB - 0.97x memory usage -0.25281 MB

**All measurements for memory usage were the same**

##### With input medium #####
Name                    ips        average  deviation         median         99th %
bench                161.10        6.21 ms     ±9.48%        6.06 ms        9.19 ms
bench (today)         98.20       10.18 ms     ±7.52%        9.96 ms       13.76 ms

Comparison: 
bench                161.10
bench (today)         98.20 - 1.64x slower +3.98 ms

Memory usage statistics:

Name             Memory usage
bench                 3.31 MB
bench (today)         3.29 MB - 1.00x memory usage -0.01551 MB

**All measurements for memory usage were the same**

##### With input small #####
Name                    ips        average  deviation         median         99th %
bench                805.16        1.24 ms    ±10.37%        1.24 ms        1.73 ms
bench (today)        263.75        3.79 ms    ±31.24%        4.31 ms        5.99 ms

Comparison: 
bench                805.16
bench (today)        263.75 - 3.05x slower +2.55 ms

Memory usage statistics:

Name             Memory usage
bench               720.79 KB
bench (today)       710.66 KB - 0.99x memory usage -10.13281 KB
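
The "x slower" figures in the comparison tables above appear to be the ratio of the two ips (iterations per second) values. The sketch below recomputes them from the reported numbers; `SlowdownSketch` is a hypothetical helper, not part of Benchee or Floki.

```elixir
defmodule SlowdownSketch do
  # ips = iterations per second, so a higher value is faster.
  # The slowdown of `other` relative to `reference` is the ips ratio.
  def slowdown(reference_ips, other_ips) do
    Float.round(reference_ips / other_ips, 2)
  end
end

# {input name, ips with this PR, ips before this PR}, copied from the tables.
results = [
  {"big", 50.48, 25.47},
  {"medium", 161.10, 98.20},
  {"small", 805.16, 263.75}
]

for {input, fast, slow} <- results do
  IO.puts("#{input}: #{SlowdownSketch.slowdown(fast, slow)}x slower")
end
# prints:
# big: 1.98x slower
# medium: 1.64x slower
# small: 3.05x slower
```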
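
Benchmark script:

# Read an HTML fixture that sits next to this script and parse it
# into a Floki HTML tree.
read_file = fn name ->
  __ENV__.file
  |> Path.dirname()
  |> Path.join(name)
  |> File.read!()
  |> Floki.parse_document!()
end

# Pre-parsed documents of three sizes; every job runs once per input.
inputs = %{
  "big" => read_file.("big.html"),
  "medium" => read_file.("medium.html"),
  "small" => read_file.("small.html")
}

Benchee.run(
  %{
    # Serializing the tree back to HTML is where text gets encoded.
    "bench" => &Floki.raw_html/1
  },
  # Seconds spent measuring run time per scenario.
  time: 10,
  inputs: inputs,
  # Seconds spent measuring memory usage per scenario.
  memory_time: 2
)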
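
The script defines only the `bench` job, while the results also list `bench (today)`. The PR does not say how that baseline was recorded. One way to get such a comparison is Benchee's `save` and `load` options: run once on the old code with a tag, then load the saved run when benchmarking the new code. The sketch below only builds the option lists; `BenchOptsSketch`, the file name `raw_html.benchee`, and the `:baseline`/`:candidate` names are assumptions.

```elixir
defmodule BenchOptsSketch do
  # Hypothetical file in which the baseline run is stored.
  @save_path "raw_html.benchee"

  # Timing options shared by both runs (same values as the script above).
  @common [time: 10, memory_time: 2]

  # Baseline run (old code): save results under the tag "today".
  def options(:baseline), do: @common ++ [save: [path: @save_path, tag: "today"]]

  # Candidate run (new code): load the saved baseline for comparison.
  def options(:candidate), do: @common ++ [load: @save_path]
end

# Intended use, with `jobs` and `inputs` defined as in the script above:
#   Benchee.run(jobs, [inputs: inputs] ++ BenchOptsSketch.options(:baseline))
#   Benchee.run(jobs, [inputs: inputs] ++ BenchOptsSketch.options(:candidate))
IO.inspect(BenchOptsSketch.options(:candidate))
```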

Owner

@philss philss left a comment

🚀

@philss philss merged commit 539d76a into philss:main Dec 28, 2023
9 checks passed
Owner

philss commented Dec 28, 2023

Nice catch! Thank you!

@ypconstante ypconstante deleted the always-use-optimal-encoding-function branch December 28, 2023 01:52