Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rbd: added rbd info to validateRBDImageCount func #4938

Open
wants to merge 2 commits into
base: devel
Choose a base branch
from

Conversation

OdedViner
Copy link

@OdedViner OdedViner commented Oct 31, 2024

Describe what this PR does

Added rbd info and rbd status on Failure message

bash-5.1$ rbd info --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
rbd image 'csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1102e70da18f
	block_name_prefix: rbd_data.1102e70da18f
	format: 2
	features: layering
	op_features: 
	flags: 
	create_timestamp: Thu Oct 31 13:26:32 2024
	access_timestamp: Thu Oct 31 13:26:32 2024
	modify_timestamp: Thu Oct 31 13:26:32 2024
bash-5.1$ rbd status --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
Watchers: none

Is there anything that requires special attention

Do you have any questions?

Is the change backward compatible?

Are there concerns around backward compatibility?

Provide any external context for the change, if any.

For example:

  • Kubernetes links that explain why the change is required
  • CSI spec related changes/catch-up that necessitates this patch
  • golang related practices that necessitates this change

Related issues

Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.

Fixes: #issue_number
#4547

Future concerns

List items that are not part of the PR and do not impact it's
functionality, but are work items that can be taken up subsequently.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow
    guidelines in the developer
    guide
    .
  • Reviewed the developer guide on Submitting a Pull
    Request
  • Pending release
    notes

    updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@mergify mergify bot added the component/rbd Issues related to RBD label Oct 31, 2024
e2e/rbd.go Outdated
framework.Failf(
"backend images not matching kubernetes resource count,image count %d kubernetes resource count %d"+
"\nbackend image Info:\n %v",
"\nbackend image Info:\n %v\n images information and status%v",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

status%v could use a space befor the %v

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
err = json.Unmarshal([]byte(stdOut), &imgStatus)
if err != nil {
return imgStatus, fmt.Errorf("unmarshal failed: %w. raw buffer response: %s",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't return the buffer in the error message, as that may get printed many more times when callers log the error. Instead, use framework.Logf() to report details like this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e2e/rbd_helper.go:1125:3: k8s.io/kubernetes/test/e2e/framework.Logf does not support error-wrapping directive %w

Failed to run container: https://url.corp.redhat.com/cc13cdb

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

framework.Logf() does not support %w for logging errors. Use %v while logging, but when %w with fmt.Errorf() or errors.New().

fmt.Sprintf("rbd status %s %s --format json", rbdOptions(poolName), imageName),
rookNamespace)
if err != nil {
return imgStatus, fmt.Errorf("failed to get rbd status: %w", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It helps to give this err != nil and the below stdErr != "" errors a slightly different message. When something goes wrong, it can be identified more precisely if the error message is unique.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@nixpanic
Copy link
Member

nixpanic commented Nov 1, 2024

/test ci/centos/mini-e2e-helm/k8s-1.30

e2e/rbd.go Outdated
return
}
// Collecting image details for printing
imageDetails = append(imageDetails, fmt.Sprintf("Pool Name: %s, Image Name: %s, Img Info: %v, Img Status: %v", pool, image, imgInfo, imgStatus))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

e2e/rbd.go Outdated
for _, image := range imageList {
imgInfo, err := getImageInfo(f, image, pool)
if err != nil {
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to log error and continue as this is for debuggin

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

e2e/rbd.go Outdated
}
imgStatus, err := getImageStatus(f, image, pool)
if err != nil {
return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to log error and continue as this is for debugging

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@OdedViner OdedViner force-pushed the add_rbd_info branch 2 times, most recently from 90b18cb to ffae514 Compare November 4, 2024 09:27
e2e/rbd_helper.go Outdated Show resolved Hide resolved
@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 4, 2024

/test ci/centos/mini-e2e-helm/k8s-1.30

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 4, 2024

/test ci/centos/mini-e2e-helm/k8s-1.30

@OdedViner
Copy link
Author

Output from failure job: https://url.corp.redhat.com/e9aa53d

[38;5;9m[FAILED] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 11
  backend image Info:
   [csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b]
   images information and status Pool: replicapool, Image: csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b, Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: {  0 false 0 }�[0m
  �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/go/src/github.com/ceph/ceph-csi/e2e/rbd.go:206�[0m �[38;5;243m@ 11/04/24 13:29:22.803�[0m

@Madhu-1 @nixpanic Is this the desired output? When I ran it manually I got the following results:

bash-5.1$ rbd info --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
rbd image 'csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1102e70da18f
	block_name_prefix: rbd_data.1102e70da18f
	format: 2
	features: layering
	op_features: 
	flags: 
	create_timestamp: Thu Oct 31 13:26:32 2024
	access_timestamp: Thu Oct 31 13:26:32 2024
	modify_timestamp: Thu Oct 31 13:26:32 2024
bash-5.1$ rbd status --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
Watchers: none

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 4, 2024

Output from failure job: https://url.corp.redhat.com/e9aa53d

[38;5;9m[FAILED] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 11
  backend image Info:
   [csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b]
   images information and status Pool: replicapool, Image: csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b, Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: {  0 false 0 }�[0m
  �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/go/src/github.com/ceph/ceph-csi/e2e/rbd.go:206�[0m �[38;5;243m@ 11/04/24 13:29:22.803�[0m

@Madhu-1 @nixpanic Is this the desired output? When I ran it manually I got the following results:

bash-5.1$ rbd info --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
rbd image 'csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1102e70da18f
	block_name_prefix: rbd_data.1102e70da18f
	format: 2
	features: layering
	op_features: 
	flags: 
	create_timestamp: Thu Oct 31 13:26:32 2024
	access_timestamp: Thu Oct 31 13:26:32 2024
	modify_timestamp: Thu Oct 31 13:26:32 2024
bash-5.1$ rbd status --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
Watchers: none

How about if we just return the stdout and log it, i think it will be useful for debugging instead of unmarshal output. i tlooks like the above logs is not much helpful, @nixpanic WDYT?

@nixpanic
Copy link
Member

nixpanic commented Nov 6, 2024

How about if we just return the stdout and log it, i think it will be useful for debugging instead of unmarshal output. i tlooks like the above logs is not much helpful, @nixpanic WDYT?

Logging stdout is sufficient for me too. It may also be easier to find the right timestamp/location of errors when the logs contain pretty unique words like block_name_prefix.

@OdedViner
Copy link
Author

Output from failure job: https://url.corp.redhat.com/e9aa53d

[38;5;9m[FAILED] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 11
  backend image Info:
   [csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b]
   images information and status Pool: replicapool, Image: csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b, Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: {  0 false 0 }�[0m
  �[38;5;9mIn �[1m[It]�[0m�[38;5;9m at: �[1m/go/src/github.com/ceph/ceph-csi/e2e/rbd.go:206�[0m �[38;5;243m@ 11/04/24 13:29:22.803�[0m

@Madhu-1 @nixpanic Is this the desired output? When I ran it manually I got the following results:

bash-5.1$ rbd info --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
rbd image 'csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7':
	size 1 GiB in 256 objects
	order 22 (4 MiB objects)
	snapshot_count: 0
	id: 1102e70da18f
	block_name_prefix: rbd_data.1102e70da18f
	format: 2
	features: layering
	op_features: 
	flags: 
	create_timestamp: Thu Oct 31 13:26:32 2024
	access_timestamp: Thu Oct 31 13:26:32 2024
	modify_timestamp: Thu Oct 31 13:26:32 2024
bash-5.1$ rbd status --pool=replicapool csi-vol-2b6574ee-9c6f-4a3b-9a96-09060763f7a7
Watchers: none

How about if we just return the stdout and log it, i think it will be useful for debugging instead of unmarshal output. i tlooks like the above logs is not much helpful, @nixpanic WDYT?

I return imageInfo, string, error because validateStripe function

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 6, 2024

Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: { 0 false 0 }

Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: { 0 false 0 } This should log both key and value from the struct without it its difficult to understand and debug it.

It would be good if we just return the stdout and log the stdout in the logs that helps for debugging, actual struct is missing few details.

@OdedViner
Copy link
Author

Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: { 0 false 0 } This should log both key and value from the struct without it its difficult to understand and debug it.

It would be good if we just return the stdout and log the stdout in the logs that helps for debugging, actual struct is missing few details.

images not matching kubernete

Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: { 0 false 0 }

Info: {csi-vol-d2bb1f89-26f7-4c37-a22d-d72f9a7aa30b 0 0 4194304}, Status: { 0 false 0 } This should log both key and value from the struct without it its difficult to understand and debug it.

It would be good if we just return the stdout and log the stdout in the logs that helps for debugging, actual struct is missing few details.

And what should be done with this function? It relies on the imgInfo struct.

ceph-csi/e2e/rbd_helper.go

Lines 1132 to 1133 in cea8bf8

imgInfo, err := getImageInfo(f, imageData.imageName, defaultRBDPool)
if err != nil {

@OdedViner
Copy link
Author

@Madhu-1 @nixpanic Which method would be better: getImageStatus, which returns a string output, or getImageInfo, which returns a struct with all parameters?

@Madhu-1
Copy link
Collaborator

Madhu-1 commented Nov 11, 2024

@Madhu-1 @nixpanic Which method would be better: getImageStatus, which returns a string output, or getImageInfo, which returns a struct with all parameters?

we could have a base function that returns the string and getImageStatus can use the base function and later do unmarshal and from the validation we will call the base function that returns the string data and print it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/rbd Issues related to RBD
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants