webinfo -- Extract metadata and structured information from web pages
webinfo is a small Go module that extracts common metadata from web pages and provides utilities
to download representative images and create thumbnails.
- Package: webinfo
- Repository: github.com/goark/webinfo
- Purpose: fetch page metadata (title, description, canonical, image, etc.) and download images

- Fetch page metadata with Fetch (handles encodings and meta tag precedence).
- Download an image referenced by Webinfo.ImageURL using (*Webinfo).DownloadImage.
- Create a thumbnail from the referenced image using (*Webinfo).DownloadThumbnail.
Use Go modules (Go 1.25+ as used by the project):
go get github.com/goark/webinfo@latest

Example that fetches metadata and saves a thumbnail (error handling kept minimal for brevity):
package main

import (
	"context"
	"fmt"

	"github.com/goark/webinfo"
)

func main() {
	ctx := context.Background()

	// Fetch metadata for a page (empty UA uses default)
	info, err := webinfo.Fetch(ctx, "https://text.baldanders.info/", "")
	if err != nil {
		fmt.Printf("error detail:\n%+v\n", err)
		return
	}

	// Download thumbnail: width 150, to directory "thumbnails", permanent file
	thumbPath, err := info.DownloadThumbnail(ctx, "thumbnails", 150, false)
	if err != nil {
		fmt.Printf("error detail:\n%+v\n", err)
		return
	}
	fmt.Println("thumbnail saved:", thumbPath)
}

- Fetch(ctx, url, userAgent): Fetch the page at url and extract its metadata. Pass an empty userAgent to use the module default.
- (*Webinfo).DownloadImage(ctx, destDir, temporary): Download the image in Webinfo.ImageURL and save it. If temporary is true (or destDir is empty), a temporary file is created.
- (*Webinfo).DownloadThumbnail(ctx, destDir, width, temporary): Download the referenced image and produce a thumbnail resized to width pixels (the height is scaled to preserve the aspect ratio). If destDir is empty, the method creates a temporary file; when temporary is false, the thumbnail file is named after the original image, with -thumb appended before the extension.
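The naming and sizing rules for thumbnails can be sketched with the standard library alone. This is a minimal illustration, not the package's code: thumbName and thumbSize are hypothetical helpers, the truncating integer division is an assumption, and the fallback to 150 mirrors the default width documented for DownloadThumbnail.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// thumbName inserts "-thumb" before the extension of an image file name.
// Hypothetical helper that mirrors the documented naming rule.
func thumbName(imageName string) string {
	ext := filepath.Ext(imageName)
	return strings.TrimSuffix(imageName, ext) + "-thumb" + ext
}

// thumbSize returns the thumbnail dimensions for a target width, scaling
// the height so that the aspect ratio of the original is kept.
// Rounding down is an assumption of this sketch.
func thumbSize(origW, origH, width int) (w, h int, err error) {
	if origW <= 0 || origH <= 0 {
		return 0, 0, fmt.Errorf("invalid image size %dx%d", origW, origH)
	}
	if width <= 0 {
		width = 150 // documented default width
	}
	h = origH * width / origW
	if h < 1 {
		h = 1 // keep at least one row for very wide images
	}
	return width, h, nil
}

func main() {
	fmt.Println(thumbName("photo.jpg")) // photo-thumb.jpg

	w, h, err := thumbSize(1200, 800, 150)
	fmt.Println(w, h, err) // 150 100 <nil>
}
```

A zero-width source image is rejected up front, which is also the case the decodeImage test hook is meant to provoke.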
Note on defaults and test hooks:
- Default width: If width <= 0 is passed to DownloadThumbnail, the method uses a default width of 150 pixels.
- Extension detection: DownloadImage determines an output extension from the URL path, the response Content-Type (via mime.ExtensionsByType), or by sniffing up to the first 512 bytes with http.DetectContentType.
- Test hooks / injection points: For easier testing the package defines a few package-level variables that tests can override. The names are unexported, so only tests inside the package can replace them:
  - createFile: used to create temporary or permanent files (wraps os.CreateTemp / os.Create). Override to simulate file-creation failures.
  - decodeImage: wrapper around image.Decode used by DownloadThumbnail. Override to simulate decode results (for example, to return a zero-dimension image).
  - outputImage: encoder that writes the thumbnail image to disk (wraps jpeg.Encode, png.Encode, etc.). Override to simulate encoder failures.
  These hooks are intended for tests and let them reproduce rare I/O or encoding failures without changing production behavior.
- HTTP client timeout: DownloadImage uses an HTTP client with a default 30-second Timeout for the whole request; tests can override this by replacing the newHTTPClient package variable.
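The extension-detection fallback described in the notes could look roughly like the sketch below. It is not the package's implementation: detectExt and extFromType are hypothetical names, and the order of the three steps (URL path, then Content-Type, then sniffing) is an assumption taken from the order in which the note lists them.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"net/url"
	"path"
)

// extFromType maps a media type to a file extension, or "" if none is known.
// The exact extension returned can depend on the system's MIME tables.
func extFromType(contentType string) string {
	if contentType == "" {
		return ""
	}
	exts, err := mime.ExtensionsByType(contentType)
	if err != nil || len(exts) == 0 {
		return ""
	}
	return exts[0]
}

// detectExt is a hypothetical helper that picks an output extension.
func detectExt(rawURL, contentType string, head []byte) string {
	// 1. Extension of the last element of the URL path.
	if u, err := url.Parse(rawURL); err == nil {
		if ext := path.Ext(u.Path); ext != "" {
			return ext
		}
	}
	// 2. Content-Type header of the response.
	if ext := extFromType(contentType); ext != "" {
		return ext
	}
	// 3. Sniff the body; DetectContentType looks at up to 512 bytes.
	if len(head) > 512 {
		head = head[:512]
	}
	return extFromType(http.DetectContentType(head))
}

func main() {
	pngMagic := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(detectExt("https://example.com/a/b.png?x=1", "", nil))
	fmt.Println(detectExt("https://example.com/img", "image/png", nil))
	fmt.Println(detectExt("https://example.com/img", "", pngMagic))
}
```

The query string never reaches path.Ext because only the parsed URL path is inspected.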
Below are short examples showing how to override the package-level hooks from a test to simulate failures.
These snippets are intended for *_test.go files and assume the usual testing and net/http/httptest helpers.
- Simulate thumbnail temporary-file creation failure (override createFile):

// in your test function
orig := createFile
defer func() { createFile = orig }()
createFile = func(temp bool, dir, pattern string) (*os.File, error) {
	// fail only for thumbnail temp pattern
	if temp && strings.Contains(pattern, "webinfo-thumb-") {
		return nil, errors.New("simulated thumbnail temp create failure")
	}
	return orig(temp, dir, pattern)
}
// then call the method under test
_, err := info.DownloadThumbnail(ctx, t.TempDir(), 50, true)
if err == nil {
	t.Fatal("expected a file-creation error")
}

- Simulate a zero-dimension decoded image (override decodeImage):

origDecode := decodeImage
defer func() { decodeImage = origDecode }()
decodeImage = func(r io.Reader) (image.Image, string, error) {
	// return an image with zero width to hit the origW==0 error path
	return image.NewRGBA(image.Rect(0, 0, 0, 10)), "png", nil
}
_, err := info.DownloadThumbnail(ctx, t.TempDir(), 50, true)
if err == nil {
	t.Fatal("expected an error for a zero-width image")
}

- Simulate encoder failure when writing thumbnails (override outputImage):

origOut := outputImage
defer func() { outputImage = origOut }()
outputImage = func(dst *os.File, src *image.RGBA, format string) error {
	return errors.New("simulated encode failure")
}
_, err := info.DownloadThumbnail(ctx, t.TempDir(), 50, true)
if err == nil {
	t.Fatal("expected an encode error")
}

Notes:
- Ensure your test imports include errors, io, image, os, strings, net/http, and time as needed.
- Restore the original variables with defer to avoid cross-test interference.
- These examples are intentionally minimal; adapt them to your test fixtures (httptest servers, temp dirs, etc.).
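The save, override, restore sequence recommended in the notes can be factored into one small generic helper. The sketch below is self-contained and uses stand-ins: openOutput plays the role of a package-level hook such as createFile (with a simplified, made-up signature), and swap and save are hypothetical names that do not exist in webinfo.

```go
package main

import (
	"errors"
	"fmt"
)

// openOutput stands in for a package-level hook such as createFile.
// Its signature is simplified for this sketch.
var openOutput = func(name string) (string, error) {
	return "opened " + name, nil
}

// swap replaces *hook with stub and returns a function that restores
// the original value.
func swap[T any](hook *T, stub T) (restore func()) {
	orig := *hook
	*hook = stub
	return func() { *hook = orig }
}

// save is the code under test; it always goes through the hook.
func save(name string) error {
	if _, err := openOutput(name); err != nil {
		return fmt.Errorf("save %q: %w", name, err)
	}
	return nil
}

var errSimulated = errors.New("simulated create failure")

func main() {
	restore := swap(&openOutput, func(string) (string, error) {
		return "", errSimulated
	})
	fmt.Println(errors.Is(save("a.png"), errSimulated)) // true: stub is active

	restore()
	fmt.Println(save("a.png")) // <nil>: original hook is back
}
```

Inside a test, the returned function can be passed straight to t.Cleanup, for example t.Cleanup(swap(&createFile, stub)), so the original hook is restored even if the test fails early.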
- Simulate HTTP client timeout by overriding newHTTPClient:

origClient := newHTTPClient
defer func() { newHTTPClient = origClient }()
newHTTPClient = func() *http.Client {
	// short timeout for test
	return &http.Client{Timeout: 50 * time.Millisecond}
}
// then call DownloadImage which uses newHTTPClient()
_, err := info.DownloadImage(ctx, t.TempDir(), true)
if err == nil {
	t.Fatal("expected a timeout error")
}

The package uses github.com/goark/errs for wrapping errors with contextual keys (e.g. url, path, dir).
Callers should inspect returned errors accordingly; the usage example above prints the detail with %+v.
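One way to inspect a wrapped error is to look for a timeout anywhere in its chain. The sketch below uses only the standard library and wraps with fmt.Errorf; it assumes that errors returned by the package keep the standard Unwrap chain, which is not confirmed here. The names isTimeout, fetchWithTimeout, and demoTimeout are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// isTimeout reports whether err, or any error it wraps, is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// fetchWithTimeout issues a GET with a whole-request timeout and wraps
// any failure with extra context.
func fetchWithTimeout(url string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return fmt.Errorf("download %s: %w", url, err)
	}
	defer resp.Body.Close()
	return nil
}

// demoTimeout starts a deliberately slow local server and reports whether
// the wrapped client error is recognised as a timeout.
func demoTimeout() bool {
	slow := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-r.Context().Done(): // client gave up
		case <-time.After(2 * time.Second):
		}
	}))
	defer slow.Close()

	return isTimeout(fetchWithTimeout(slow.URL, 50*time.Millisecond))
}

func main() {
	fmt.Println("timed out:", demoTimeout())
}
```

Checking with errors.As rather than comparing error strings keeps the check working no matter how many layers of context are added around the original error.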
- Run all tests: go test ./...
- The repository includes Taskfile.yml tasks for common workflows; see that file for CI/test commands.
