[PATCH V1] hostdisk: Fix linux disk cache workaround on multipath disks
Michael Chang
2018-09-13 07:20:34 UTC
In grub-core/osdep/linux/hostdisk.c::grub_util_fd_open_device() there is a
comment about a Linux disk cache issue:

/* Linux has a bug that the disk cache for a whole disk is not consistent
with the one for a partition of the disk. */
{
....
}

The input to grub_util_fd_open_device() is an address in sector-size units,
measured from the start of the "disk". To avoid the Linux disk cache
inconsistency described in the comment above, grub translates that address
into an offset from the start of the partition that contains it, then uses
that partition device in place of the disk device.
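
The address translation described above can be sketched roughly as follows.
This is only an illustration: disk_to_partition_sector() and its parameters
are hypothetical names, not GRUB functions, and the partition start and
length are assumed to be already known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GRUB): convert a sector address that is
   relative to the whole disk into one relative to the partition that
   contains it.  Returns 0 on success, -1 if the sector lies outside the
   partition.  */
static int
disk_to_partition_sector (uint64_t disk_sector, uint64_t part_start,
                          uint64_t part_len, uint64_t *part_sector)
{
  assert (part_sector != NULL);

  /* The sector must fall inside [part_start, part_start + part_len).  */
  if (disk_sector < part_start || disk_sector - part_start >= part_len)
    return -1;

  *part_sector = disk_sector - part_start;
  return 0;
}
```

With an assumed partition starting at sector 2048, disk sector 2050 would map
to sector 2 of the partition device.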

The problem we encountered was that installing grub into a partition of a
multipath disk did not work reliably. It boiled down to the disk cache problem
described above: strace showed that grub was still using the whole-disk
device, not the partition device we would expect.

This patch fixes the problem by adding the missing "/dev/dm-" name scheme
handling to grub_hostdisk_linux_find_partition(). With the patch applied the
problem is solved, and we would like to have the fix upstreamed as it looks
like good material for that.
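
To illustrate the name scheme, here is a minimal standalone sketch of how the
new branch forms candidate names, assuming that len is the string length of
the device name and that the "%d" format is written starting at the last
character. dm_candidate_name() is a hypothetical helper written for this
example and is not the code in hostdisk.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GRUB code): build a candidate node name by
   overwriting the last character of DISK with INDEX, mirroring
   "p = real_dev + len - 1; format = "%d";" under the assumption that
   len == strlen (real_dev).  Returns 0 on success, -1 if OUT is too small.  */
static int
dm_candidate_name (const char *disk, int index, char *out, size_t out_size)
{
  size_t len;
  size_t room;
  int n;

  assert (disk != NULL && out != NULL);

  len = strlen (disk);
  if (len == 0 || len >= out_size)
    return -1;

  memcpy (out, disk, len + 1);

  /* Start writing at the last character of the original name.  */
  room = out_size - (len - 1);
  n = snprintf (out + len - 1, room, "%d", index);
  if (n < 0 || (size_t) n >= room)
    return -1;
  return 0;
}
```

Under these assumptions "/dev/dm-0" with index 1 gives "/dev/dm-1", while
"/dev/dm-10" with index 1 gives "/dev/dm-11", because only the final digit is
replaced.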

v1: Rework commit message.

Signed-off-by: Michael Chang <***@suse.com>
---
grub-core/osdep/linux/hostdisk.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/grub-core/osdep/linux/hostdisk.c b/grub-core/osdep/linux/hostdisk.c
index 06179fca7..ed530bdc4 100644
--- a/grub-core/osdep/linux/hostdisk.c
+++ b/grub-core/osdep/linux/hostdisk.c
@@ -263,6 +263,12 @@ grub_hostdisk_linux_find_partition (char *dev, grub_disk_addr_t sector)
           p = real_dev + len;
           format = "-part%d";
         }
+      else if (strncmp (real_dev, "/dev/dm-",
+                        sizeof ("/dev/dm-") - 1) == 0)
+        {
+          p = real_dev + len - 1;
+          format = "%d";
+        }
       else if (real_dev[len - 1] >= '0' && real_dev[len - 1] <= '9')
         {
           p = real_dev + len;
--
2.13.6
Daniel Kiper
2018-09-19 13:28:34 UTC
Post by Michael Chang
v1: Rework commit message.
Thanks! Right now it looks much better for me.
Post by Michael Chang
---
grub-core/osdep/linux/hostdisk.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/grub-core/osdep/linux/hostdisk.c b/grub-core/osdep/linux/hostdisk.c
index 06179fca7..ed530bdc4 100644
--- a/grub-core/osdep/linux/hostdisk.c
+++ b/grub-core/osdep/linux/hostdisk.c
@@ -263,6 +263,12 @@ grub_hostdisk_linux_find_partition (char *dev, grub_disk_addr_t sector)
p = real_dev + len;
format = "-part%d";
}
+ else if (strncmp (real_dev, "/dev/dm-",
+ sizeof ("/dev/dm-") - 1) == 0)
+ {
+ p = real_dev + len - 1;
+ format = "%d";
+ }
What will happen if the device path is /dev/dm-10?
Post by Michael Chang
else if (real_dev[len - 1] >= '0' && real_dev[len - 1] <= '9')
{
p = real_dev + len;
...and I am afraid that above line is buggy too...
What about the other cases?

Daniel
Michael Chang
2018-09-20 10:14:05 UTC
Post by Daniel Kiper
Post by Michael Chang
---
grub-core/osdep/linux/hostdisk.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/grub-core/osdep/linux/hostdisk.c b/grub-core/osdep/linux/hostdisk.c
index 06179fca7..ed530bdc4 100644
--- a/grub-core/osdep/linux/hostdisk.c
+++ b/grub-core/osdep/linux/hostdisk.c
@@ -263,6 +263,12 @@ grub_hostdisk_linux_find_partition (char *dev, grub_disk_addr_t sector)
p = real_dev + len;
format = "-part%d";
}
+ else if (strncmp (real_dev, "/dev/dm-",
+ sizeof ("/dev/dm-") - 1) == 0)
+ {
+ p = real_dev + len - 1;
+ format = "%d";
+ }
What will happen if the device path is /dev/dm-10?
It depends on the number of consecutive failed attempts to open the device,
tried in sequence starting from /dev/dm-1, before reaching /dev/dm-10. The
limit is set to 10 and the count is reset to 0 after any successful open. So
it may or may not give up before /dev/dm-10, depending on how many
consecutive failures are encountered.
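
The probing behaviour described above can be sketched roughly like this. It
is a simplified model, not the real loop: probe_fn stands in for open() plus
the partition start check, find_partition_node() and the two example probes
are hypothetical, and giving up on exactly the 10th consecutive failure is an
assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_MISSING 10  /* assumed limit on consecutive failed opens */

enum probe_result { NODE_MISSING, NODE_OTHER, NODE_MATCH };

/* Hypothetical stand-in for open() plus the partition start comparison.  */
typedef enum probe_result (*probe_fn) (const char *path);

/* Try <base>1, <base>2, ... and return the index of the first matching
   node, or -1 after MAX_MISSING consecutive missing nodes.  On success OUT
   holds the matching name.  */
static int
find_partition_node (const char *base, int max_index, probe_fn probe,
                     char *out, size_t out_size)
{
  int missing = 0;

  assert (base != NULL && probe != NULL && out != NULL);

  for (int i = 1; i <= max_index; i++)
    {
      int n = snprintf (out, out_size, "%s%d", base, i);
      if (n < 0 || (size_t) n >= out_size)
        return -1;              /* candidate name does not fit */

      switch (probe (out))
        {
        case NODE_MATCH:
          return i;
        case NODE_OTHER:
          missing = 0;          /* a successful open resets the counter */
          break;
        case NODE_MISSING:
          if (++missing >= MAX_MISSING)
            return -1;          /* too many consecutive failures */
          break;
        }
    }
  return -1;
}

/* Example probes: only one node exists on the simulated system.  */
static enum probe_result
only_dm10 (const char *path)
{
  return strcmp (path, "/dev/dm-10") == 0 ? NODE_MATCH : NODE_MISSING;
}

static enum probe_result
only_dm12 (const char *path)
{
  return strcmp (path, "/dev/dm-12") == 0 ? NODE_MATCH : NODE_MISSING;
}
```

In this model /dev/dm-10 is still reached when /dev/dm-1 through /dev/dm-9
are all missing (nine failures), but /dev/dm-12 is not, since the tenth
consecutive failure ends the search first.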
Post by Daniel Kiper
Post by Michael Chang
else if (real_dev[len - 1] >= '0' && real_dev[len - 1] <= '9')
{
p = real_dev + len;
...and I am afraid that above line is buggy too...
It seems to guess the partition device name from a disk device name matching
the pattern /dev/.*[0-9] to be /dev/.*[0-9]p[0-9]+, which is wrong for dm, as
it produces a partition name like /dev/dm-0p1.
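
A small sketch of that guess, to make the failure mode concrete.
guess_partition_name() is a hypothetical helper, not the GRUB function, and
appending the number directly when the name does not end in a digit is an
assumption of this example rather than something shown in the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GRUB code): guess a partition node name from a
   disk node name.  If DISK ends in a digit, a 'p' separator is inserted
   before the partition number; otherwise the number is appended directly
   (assumed behaviour).  Returns 0 on success, -1 on error.  */
static int
guess_partition_name (const char *disk, int partno, char *out,
                      size_t out_size)
{
  size_t len;
  const char *sep;
  int n;

  assert (disk != NULL && out != NULL);

  len = strlen (disk);
  if (len == 0)
    return -1;

  sep = (disk[len - 1] >= '0' && disk[len - 1] <= '9') ? "p" : "";
  n = snprintf (out, out_size, "%s%s%d", disk, sep, partno);
  if (n < 0 || (size_t) n >= out_size)
    return -1;
  return 0;
}
```

For "/dev/nvme0n1" this yields "/dev/nvme0n1p1", but for "/dev/dm-0" it
yields "/dev/dm-0p1", which is the name mentioned above as wrong for dm.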
Post by Daniel Kiper
What about the other cases?
I think the guesswork is not good, but that is not the problem this patch set
out to fix. It aims to fill in a missing piece of the current guesswork.

Of course, that doesn't prevent us from coming up with another patch to
improve the guesswork, or even to get rid of it by preserving the input
partition name somewhere ...

Thanks,
Michael