
@vinit-chauhan vinit-chauhan commented Oct 27, 2025

elastic-package is used extensively for integration-related tasks. However, most of its commands operate on one package at a time, and there is currently no way to repeat an operation across multiple integrations.

This pull request adds two subcommands to elastic-package to enable bulk operations:

  1. find
  2. foreach

Note: Both commands are expected to be run from the integrations repository.

Find subcommand

Find command adds the ability to filter and return a list of integrations based on specified criteria.

Available Filters:

  • Category (--categories)
  • Codeowner (--code-owners)
  • Input (--inputs)
  • Package Dir Name (--package-dirs) (supports glob patterns)
  • Package Name (--packages) (supports glob patterns)
  • Package Type (--package-types) (e.g., integration, input)
  • Spec Version (--spec-version)

You can chain multiple filters, and each filter accepts multiple comma-separated values.
Matching:
A package must match every filter that is provided.
Within a filter, matching at least one of its values is enough.
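The matching rule above (AND across filters, OR across the values of one filter) can be sketched in Go as follows. This is an illustration only, not the code in internal/filter: the `pkgInfo` and `criteria` types and both helper functions are hypothetical names, and using `path.Match` for the glob patterns is an assumption about how the package-name filter behaves.

```go
package main

import (
	"fmt"
	"path"
)

// pkgInfo is a hypothetical, simplified view of a package manifest.
type pkgInfo struct {
	Name   string
	Owner  string
	Inputs []string
}

// criteria holds the values given for each filter flag.
// An empty slice means the filter was not provided.
type criteria struct {
	Names  []string // glob patterns, e.g. "cisco_*"
	Owners []string // exact values
	Inputs []string // exact values
}

// globAny reports whether name matches at least one pattern (OR).
// A malformed pattern is treated as matching nothing.
func globAny(patterns []string, name string) bool {
	for _, p := range patterns {
		if ok, err := path.Match(p, name); err == nil && ok {
			return true
		}
	}
	return false
}

// containsAny reports whether at least one candidate equals one wanted value (OR).
func containsAny(wanted []string, candidates ...string) bool {
	for _, w := range wanted {
		for _, c := range candidates {
			if w == c {
				return true
			}
		}
	}
	return false
}

// matches requires every provided filter to accept the package (AND).
func (c criteria) matches(p pkgInfo) bool {
	if len(c.Names) > 0 && !globAny(c.Names, p.Name) {
		return false
	}
	if len(c.Owners) > 0 && !containsAny(c.Owners, p.Owner) {
		return false
	}
	if len(c.Inputs) > 0 && !containsAny(c.Inputs, p.Inputs...) {
		return false
	}
	return true
}

func main() {
	pkgs := []pkgInfo{
		{Name: "cisco_asa", Owner: "elastic/integration-experience", Inputs: []string{"tcp", "udp"}},
		{Name: "cisco_duo", Owner: "elastic/integration-experience", Inputs: []string{"httpjson"}},
		{Name: "nginx", Owner: "elastic/integration-experience", Inputs: []string{"tcp"}},
	}
	c := criteria{
		Names:  []string{"cisco_*"},
		Owners: []string{"elastic/integration-experience"},
		Inputs: []string{"tcp", "udp"},
	}
	for _, p := range pkgs {
		if c.matches(p) {
			fmt.Println(p.Name)
		}
	}
}
```

With the sample data, only `cisco_asa` satisfies all three filters: `cisco_duo` has no matching input and `nginx` does not match the name pattern.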

Currently, the find command runs sequentially: it reads all the manifest files at once and keeps them in memory. We can later update the code to use a buffered channel with a producer and a consumer to reduce the memory footprint.
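A rough sketch of the producer/consumer idea mentioned above, under stated assumptions: the `manifest` type, `produce`, and `consume` are hypothetical names, and the manifests come from an in-memory slice here, whereas a real producer would read each manifest from disk just before sending it.

```go
package main

import (
	"fmt"
	"strings"
)

// manifest is a hypothetical stand-in for a parsed package manifest.
type manifest struct {
	Name string
	Type string
}

// produce sends manifests one at a time on a buffered channel and closes
// the channel when done. The buffer size bounds how many manifests are
// waiting in memory between producer and consumer.
func produce(items []manifest, buffer int) <-chan manifest {
	ch := make(chan manifest, buffer)
	go func() {
		defer close(ch)
		for _, m := range items {
			ch <- m // blocks while the buffer is full
		}
	}()
	return ch
}

// consume drains the channel and keeps the names of accepted manifests.
func consume(ch <-chan manifest, keep func(manifest) bool) []string {
	var names []string
	for m := range ch {
		if keep(m) {
			names = append(names, m.Name)
		}
	}
	return names
}

func main() {
	items := []manifest{
		{Name: "apache", Type: "integration"},
		{Name: "sql_input", Type: "input"},
		{Name: "nginx", Type: "integration"},
	}
	onlyIntegrations := func(m manifest) bool { return m.Type == "integration" }
	names := consume(produce(items, 2), onlyIntegrations)
	fmt.Println(strings.Join(names, "\n"))
}
```

With a single producer and a single consumer the input order is preserved, so the result stays deterministic.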

Neither elastic-package nor the package spec currently enforces that a package's name matches its directory name in the repo, so some integrations have a package name that differs from their directory name.

The find command by default returns the absolute path of the package in the integration repo. However, it also provides a flag --output-info to make it return the package name or directory name.

It also provides an --output flag to select the output format: JSON, YAML, or newline-separated values (default).

If no filter flag is provided, the command will return a list of all the integrations.

elastic-package find --inputs tcp,udp --code-owners elastic/integration-experience --packages 'cisco_*'

Foreach subcommand

The foreach command reuses the filter registry, so every filter flag is available to foreach without additional code changes.

Additionally, foreach is planned to have a --parallel flag that runs the command across packages concurrently using a worker pool. This flag will be added in a follow-up PR. Its default is 1, which runs the commands sequentially.

The elastic-package command you want to run goes after -- with all of its flags.

Note: Only the following elastic-package subcommands can be run through foreach: ["build","check","changelog","clean","format","install","lint","test","uninstall"] (see cmd/foreach.go)

elastic-package foreach --inputs tcp,udp --code-owners elastic/integration-experience -- test system -g
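To illustrate how the arguments around `--` and the allow list fit together, here is a small sketch. It does not use cobra and is not the code in cmd/foreach.go; `splitAtDash` and `validateSubcommand` are hypothetical helpers, and only the list of allowed names is taken from the note above.

```go
package main

import (
	"errors"
	"fmt"
)

// allowed mirrors the allow list quoted in the PR description.
var allowed = map[string]bool{
	"build": true, "check": true, "changelog": true, "clean": true, "format": true,
	"install": true, "lint": true, "test": true, "uninstall": true,
}

// splitAtDash separates foreach's own arguments from the subcommand
// (and its flags) that follows the first "--".
func splitAtDash(args []string) (own, sub []string, err error) {
	for i, a := range args {
		if a == "--" {
			return args[:i], args[i+1:], nil
		}
	}
	return nil, nil, errors.New(`missing "--" separator`)
}

// validateSubcommand rejects an empty subcommand or one outside the allow list.
func validateSubcommand(sub []string) error {
	if len(sub) == 0 {
		return errors.New(`no subcommand given after "--"`)
	}
	if !allowed[sub[0]] {
		return fmt.Errorf("subcommand %q is not allowed in foreach", sub[0])
	}
	return nil
}

func main() {
	args := []string{"--inputs", "tcp,udp", "--", "test", "system", "-g"}
	own, sub, err := splitAtDash(args)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if err := validateSubcommand(sub); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("filter args:", own)
	fmt.Println("subcommand: ", sub)
}
```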

File changes:

internal/packages/packages.go: Added function to find the integrations repo root dir and read all manifests.
cmd/find.go: Find command implementation
cmd/foreach.go: Foreach command implementation
internal/filter/*: Filter interface and implementation for each filter flag.

Related Issues

AI Tools used

  • Cursor With Claude-4.5-Sonnet
  • Antigravity With Gemini-3-Pro

@jrmolin

jrmolin commented Nov 10, 2025

are you going to pull this out of draft?

@jsoriano jsoriano mentioned this pull request Nov 11, 2025
@vinit-chauhan vinit-chauhan force-pushed the implement-filter-registry branch from 61bba87 to 689eb72 Compare November 12, 2025 17:49
@vinit-chauhan vinit-chauhan marked this pull request as ready for review November 12, 2025 19:17
@vinit-chauhan
Author

Removed the code for parallel execution from the PR for now.
Once the required changes in elastic-package are merged, I will raise a new PR.

@mjwolf
Contributor

mjwolf commented Nov 12, 2025

It could be nice to add some additional output formats to the filter command. For example, one result per line output would make working with shell scripts easier. But, it's also easy to convert the json output to do this with jq, so adding multiple formats might not be critical to add.

// Validate it's a package manifest
ok, err := isPackageManifest(path)
if err != nil {
// Log the error but continue searching
Contributor

This doesn't actually log the error here. You should either remove the comment or add logging

@jrmolin jrmolin left a comment

there's a startling lack of testing, but it looks reasonable otherwise

FilterOutputFlagShorthand = "o"

FilterPackageDirNameFlagName = "package-dirs"
FilterPackageDirNameFlagDescription = "package directories to filter by (comma-separated values)"
Member

How is this flag expected to work? If I use --package-dirs ./test/packages/parallel in the elastic-package repository it doesn't find any package.

$ elastic-package filter --package-dirs ./test/packages/parallel
2025/11/12 21:20:31  INFO Found 0 matching package(s)
null

Author

@vinit-chauhan vinit-chauhan Nov 13, 2025

package-dirs would filter packages based on the name of the package's directory. This is used to search without needing to know the package name in the manifest file.

example:

 go run main.go -C ../integrations filter --package-dirs sql_input -o pkgname
2025/11/13 09:29:56  INFO Found 1 matching package(s)
["sql"]

if you want to filter all packages in one directory you can use the following command.

❯ go run main.go -C test/packages/parallel filter
2025/11/13 09:11:51  INFO Found 19 matching package(s)
["apache","apache_basic_license","auditd_manager","auth0_logsdb","aws","awsfirehose","custom_entrypoint","httpcheck","mongodb","nginx","nginx_multiple_services","oracle","otel_http_server","sql_input","system","terraform_local","ti_anomali","ti_anomali_logsdb","ti_anomali_template"]

Member

Oh ok. So it is not possible to search in multiple directories. I guess this is fine.

Contributor

does it make sense to add the example to the description so it goes to the docs?

Member

I tried the command again today and again struggled to find packages per directory.

I also find it a bit confusing to have to use -C for this. Its purpose is to change directory before running the operation, and for this command it prevents looking for packages in multiple directories.

What is the use case for filtering packages per dir name with --package-dirs?

I think it would be nice to have an explicit flag to indicate where to look for packages, though at this point it would likely add more confusion.

Author

By default, it looks for packages starting from the current directory, with a search depth of 2.

So you can either increase --depth/-d to search through more directories, or run the command from the directory you want to search in.

CWD: Users/vinit.chauhan/github.com/elastic/elastic-package
❯ elastic-package filter -d 4 --output-info absolute_path
2025/11/26 10:44:41  INFO Found 65 matching package(s)
/Users/vinit.chauhan/github.com/elastic/elastic-package/internal/fields/testdata
/Users/vinit.chauhan/github.com/elastic/elastic-package/internal/files/testdata/links
/Users/vinit.chauhan/github.com/elastic/elastic-package/internal/files/testdata/testpackage
/Users/vinit.chauhan/github.com/elastic/elastic-package/new_package
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/new_package2
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/benchmarks/pipeline_benchmark
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/benchmarks/rally_benchmark
cd test/packages/parallel 
❯ elastic-package filter -d 4 --output-info absolute_path -v
2025/11/26 10:58:43  INFO Found 19 matching package(s)
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/parallel/apache
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/parallel/apache_basic_license
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/parallel/auditd_manager
/Users/vinit.chauhan/github.com/elastic/elastic-package/test/packages/parallel/auth0_logsdb

The only reason to have a flag for filtering by package directory is that the package spec allows the package name and the package's directory name to differ.

A user might want to search for all Cisco packages; they can do that with the following command.

CWD: .../integrations/
❯ elastic-package filter --package-dirs 'cisco*'
2025/11/26 11:08:44  INFO Found 12 matching package(s)
cisco_aironet
cisco_asa
cisco_duo
cisco_ftd
cisco_ios
cisco_ise
cisco_meraki
cisco_meraki_metrics
cisco_nexus
cisco_secure_email_gateway
cisco_secure_endpoint
cisco_umbrella

errors := multierror.Error{}

for _, pkg := range filtered {
rootCmd := cmd.Root()
Member

We could define another root command with the specific commands that are supported and use it here, then we could have different sets of subcommands, and maybe we could even plug it directly as subcommands of foreach.

This could be left for a future refactor too.

Something like this:

import (
        "github.com/spf13/cobra"

        "github.com/elastic/elastic-package/internal/cobraext"
)

var forEachCommands = []*cobraext.Command{
        setupBuildCommand(),
        setupCheckCommand(),
        setupCleanCommand(),
        setupFormatCommand(),
        setupInstallCommand(),
        setupLintCommand(),
        setupTestCommand(),
        setupUninstallCommand(),
}

// ForEachRootCmd creates and returns root cmd for elastic-package
func ForEachRootCmd() *cobra.Command {
        forEachCmd := &cobra.Command{
                SilenceUsage: true,
                PersistentPreRunE: func(cmd *cobra.Command, args []string) error {
                        return cobraext.ComposeCommandActions(cmd, args,
                                processPersistentFlags,
                                checkVersionUpdate,
                        )
                },
        }
        forEachCmd.PersistentFlags().CountP(cobraext.VerboseFlagName, cobraext.VerboseFlagShorthand, cobraext.VerboseFlagDescription)
        forEachCmd.PersistentFlags().StringP(cobraext.ChangeDirectoryFlagName, cobraext.ChangeDirectoryFlagShorthand, "", cobraext.ChangeDirectoryFlagDescription)

        for _, cmd := range forEachCommands {
                forEachCmd.AddCommand(cmd.Command)
        }
        return forEachCmd
}

That could be used here like this:

Suggested change
- rootCmd := cmd.Root()
+ rootCmd := ForEachRootCmd()
+ rootCmd.SetContext(cmd.Context())

Author

This is something we need to do in order to add parallel execution. That's why I left it out of this PR; I'd like to revisit it while I work on the other PR.

That said, we do have a list of allowed subcommands in cmd/foreach.go.

Let me know if I missed any command. I'll also add the changelog command to the allow list.

Member

Yes, we can leave this for a future refactor.

> I'll also add changelog command in allow list.

Yes please.

@jsoriano
Member

> It could be nice to add some additional output formats to the filter command. For example, one result per line output would make working with shell scripts easier.

I would also prefer to have this format by default, and make json an optional alternative format.

- add build to exclude list
- default output to nsv. also allow json and yaml
@vinit-chauhan
Author

Updated the code to support multiple output formats.

- Newline-separated list (default)
- JSON
- YAML

@teresaromero
Contributor

@vinit-chauhan could we add some unit testing for the filters logic? or perhaps a test for the new cmd in order to validate the output of the command with some test packages?

Member

@jsoriano jsoriano left a comment

I would say to merge this and continue in follow-ups, but some of the questions are related to the overall UX, and if we change that later we could break things.

So maybe we could just add some warnings, merge it, and polish this later if needed. WDYT?


FilterOutputInfoFlagName = "output-info"
FilterOutputInfoFlagDescription = "output information about the packages. Available options: pkgname, dirname, absolute"
FilterOutputInfoFlagDefault = "dirname"
Member

I wonder what default is more useful, maybe the absolute path is better as it can be directly used without making any assumption?

dirname can be strange because there could be multiple of them with the same value if they are under different subdirectories.


Use --change-directory to change the working directory and --depth to change the depth of the search.

### `elastic-package foreach [flags] -- <SUBCOMMAND>`
Member

We should list somewhere what commands are available for foreach.

README.md Outdated

Use this command to download selected ingest pipelines and its referenced processor pipelines from Elasticsearch. Select data stream or the package root directories to download the pipelines. Pipelines are downloaded as is and will need adjustment to meet your package needs.

### `elastic-package filter [flags]`
Member

This command is looking for packages more than filtering them. Should it be called elastic-package find?

Author

Hmm, Now that I think of this, naming it find makes more sense.

FilterOutputFlagShorthand = "o"

FilterOutputInfoFlagName = "output-info"
FilterOutputInfoFlagDescription = "output information about the packages. Available options: pkgname, dirname, absolute"
Member

Should we think about supporting multiple values at some point? We can add this later, but we should decide now whether the output format should be different. Especially for JSON and YAML, we could list objects instead of plain values in preparation for including more data in the future.

[{"dirname":"apache"},{"dirname":"apache_tomcat"}]
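As an illustration of the object-per-package shape, the sketch below marshals a slice of structs with `encoding/json`. The `packageEntry` type, its field names, and `render` are hypothetical. The `omitempty` tags keep unrequested fields out of the output, and the nil check matters because `encoding/json` encodes a nil slice as `null`, which is consistent with the `null` printed for an empty result earlier in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// packageEntry is a hypothetical output record. Fields left empty are
// omitted, so more data can be added later without changing the shape.
type packageEntry struct {
	DirName     string `json:"dirname,omitempty"`
	PackageName string `json:"package_name,omitempty"`
}

// render encodes the entries as a JSON array of objects.
// A nil slice is replaced by an empty one so the output is [] and not null.
func render(entries []packageEntry) (string, error) {
	if entries == nil {
		entries = []packageEntry{}
	}
	b, err := json.Marshal(entries)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := render([]packageEntry{{DirName: "apache"}, {DirName: "apache_tomcat"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints [{"dirname":"apache"},{"dirname":"apache_tomcat"}]
}
```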

Author

Yes, we could definitely include that in future if we find any use case for that.

}

func (o *OutputOptions) validate() error {
validInfo := []string{"pkgname", "dirname", "absolute"}
Member

Nit. As this is exposed as user options I would use name or package_name for the package name, and maybe absolute_path instead of absolute.

Suggested change
- validInfo := []string{"pkgname", "dirname", "absolute"}
+ validInfo := []string{"package_name", "dir_name", "absolute_path"}

cmd/filter.go Outdated
return cobraext.NewCommand(cmd, cobraext.ContextPackage)
}

func filterCommandAction(cmd *cobra.Command, args []string) error {
Member

We could log that these new commands are in technical preview, so we can more freely make breaking changes in follow ups.

Author

@vinit-chauhan vinit-chauhan Nov 26, 2025

I've added [Technical Preview] in both short and long descriptions.

Do you have any preference on where we should put the note?

@vinit-chauhan
Author

> I would say to merge this and continue in follow-ups, but some of the questions are related to the overall UX, and if we change that later we could break things.
>
> So maybe we could just add some warnings, merge it, and polish this later if needed. WDYT?

Yeah, that makes sense. I'll update the code to address all of your review comments. Would you mind creating an issue where we can track the changes we need for the future?

@elasticmachine
Collaborator

💚 Build Succeeded

History

cc @vinit-chauhan

@vinit-chauhan vinit-chauhan changed the title Add filter and foreach sub commands for bulk operations Add find and foreach sub commands for bulk operations Nov 26, 2025
Development

Successfully merging this pull request may close these issues.

Add subcommand for bulk actions

6 participants