Discussion:
[PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
Daniel Kiper
2018-06-14 11:17:53 UTC
---
grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..4d418859b 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
#define GRUB_BTRFS_CHUNK_TYPE_RAID1 0x10
#define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED 0x20
#define GRUB_BTRFS_CHUNK_TYPE_RAID10 0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5 0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6 0x100
grub_uint8_t dummy2[0xc];
grub_uint16_t nstripes;
grub_uint16_t nsubstripes;
@@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
stripe_offset = low + chunk_stripe_length
* high;
csize = chunk_stripe_length - low;
+ break;
+ }
+ {
+ grub_uint64_t nparities, stripe_nr, high, low;
+
+ redundancy = 1; /* no redundancy for now */
+
+ if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+ {
+ grub_dprintf ("btrfs", "RAID5\n");
+ nparities = 1;
+ }
+ else
+ {
+ grub_dprintf ("btrfs", "RAID6\n");
+ nparities = 2;
+ }
+
+ /*
+ * Below is an example of a RAID 6 layout and the meaning of the
+ * variables. The same applies to RAID 5. The only differences is
+ * that there is only one parity disk instead of two.
+ *
+ * A RAID 6 layout consists of several stripes spread
+ * on the disks, following a layout like the one below
+ *
+ * Disk1 Disk2 Disk3 Ddisk4
Numbering seems confusing to me. I think that it should be
Disk0 Disk1 Disk2 Disk3
+ *
+ * A1 B1 P1 Q1
+ * Q2 A2 B2 P2
+ * P3 Q3 A3 B3
+ * [...]
+ *
+ * Note that the placement of the parities depends on row index.
+ * - stripe_nr is the stripe number not considering the parities
+ * (A1=0, B1=1, A2 = 2, B2 = 3, ...),
Please be consistent. A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...
+ * - high is the row number (0 for A1...Q1, 1 for Q2..P2, ...),
Ditto. Please always use "..." not "..".
+ * - stripen is the column number (or disk number),
AIUI starting from 0. Right? If yes then I think that
drawing above requires disks/columns renumbering.
+ * - off is the logical address to read (from the beginning of
+ * the chunk space),
s/chunk space/chunk/?
+ * - chunk_stripe_length is the size of a stripe (typically 64k),
+ * - nstripes is the number of disks,
+ * - low is the offset of the data inside a stripe,
+ * - stripe_offset is the offset from the beginning of the chunk
+ * disks physical address,
I am not sure that I understand. Could you clarify this?
+ * - csize is the "potential" data to read. It will be reduced to
+ * size if the latter is smaller.
+ */
+ stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
OK.
+ /*
+ * stripen is evaluated without considering
+ * the parities (0 for A1, A2, A3... 1 for B1, B2...).
+ */
+ high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
OK.
+ /*
+ * stripen now considers also the parities (0 for A1, 1 for A2,
+ * 2 for A3....). The math is performed modulo number of disks.
+ */
+ grub_divmod64 (high + stripen, nstripes, &stripen);
OK.
+ stripe_offset = low + chunk_stripe_length * high;
Hmmm... I am confused. What does it mean?

Daniel
Daniel Kiper
2018-06-14 11:25:09 UTC
This helper will be used in a few places to help the debugging. As
conservative approach, in case of error it is only logged. This is
s/, in case of error it/ the error/
because I am not sure if this can change something in the error
handling of the currently existing code.
s/code/code or not/

Otherwise LGTM.

Daniel
Daniel Kiper
2018-06-14 11:28:20 UTC
This is a preparatory patch. The caller knows better if this
error is fatal or not, i.e. another disk is available or not.
Please make the first sentence the last one, on a separate line.
The same applies to other patches.

Otherwise LGTM.

Daniel
Daniel Kiper
2018-06-14 11:52:18 UTC
A portion of the logging code is moved outside of internal for(;;). The part
that is left inside is the one which depends by the internal for(;;) index.
s/depends by/depends on/
This is a preparatory patch: in the next one it will be possible to refactor
s/: in the next one it will be possible to/. The next one will/
the code inside the for(;;) in an another function.
s/in/into/
---
grub-core/fs/btrfs.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index b64b692f8..e2baed665 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -867,6 +867,18 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
for (j = 0; j < 2; j++)
{
+ grub_dprintf ("btrfs", "chunk 0x%" PRIxGRUB_UINT64_T
+ "+0x%" PRIxGRUB_UINT64_T
+ " (%d stripes (%d substripes) of %"
+ PRIxGRUB_UINT64_T ") \n",
s/") \n"/")\n"/

Otherwise LGTM.

Daniel
Daniel Kiper
2018-06-14 12:38:15 UTC
Move the code in charge to read the data from disk in a separate
s/in a/into a/
function. This helps to separate the error handling logic (which depend by
s/depend by/depends on/ Please fix this here and in other patches too.
the different raid profiles) from the read from disk logic.
Refactoring this code increases the general readability too.
This is a preparatory patch, to help the adding of the RAID 5/6 recovery
code.
---
grub-core/fs/btrfs.c | 76 ++++++++++++++++++++++++++------------------
1 file changed, 45 insertions(+), 31 deletions(-)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index e2baed665..9cdbfe792 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -625,6 +625,47 @@ find_device (struct grub_btrfs_data *data, grub_uint64_t id)
return ctx.dev_found;
}
+static grub_err_t
+btrfs_read_from_chunk (struct grub_btrfs_data *data,
+ struct grub_btrfs_chunk_item *chunk,
+ grub_uint64_t stripen, grub_uint64_t stripe_offset,
+ int redundancy, grub_uint64_t csize,
+ void *buf)
+{
+
Please drop this empty line.
+ struct grub_btrfs_chunk_stripe *stripe;
+ grub_disk_addr_t paddr;
+ grub_device_t dev;
+ grub_err_t err;
+
+ stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
+ /* Right now the redundancy handling is easy.
+ With RAID5-like it will be more difficult. */
+ stripe += stripen + redundancy;
+
+ paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
+
+ grub_dprintf ("btrfs", "stripe %" PRIxGRUB_UINT64_T
+ " maps to 0x%" PRIxGRUB_UINT64_T "\n",
+ stripen, stripe->offset);
+ grub_dprintf ("btrfs", "reading paddr 0x%" PRIxGRUB_UINT64_T "\n", paddr);
Could you put this into one grub_dprintf() call?
+
+ dev = find_device (data, stripe->device_id);
+ if (!dev)
+ {
+ grub_dprintf ("btrfs",
+ "couldn't find a necessary member device "
+ "of multi-device filesystem\n");
+ grub_errno = GRUB_ERR_NONE;
+ return GRUB_ERR_READ_ERROR;
Why grub_errno = GRUB_ERR_NONE and then return GRUB_ERR_READ_ERROR?
If you do things like that I think that you should add a few words
of comment before such code. Otherwise it is confusing.

Daniel
Daniel Kiper
2018-06-14 13:03:53 UTC
Add support for recovery fo a RAID 5 btrfs profile. In addition
s/fo /for /
it is added some code as preparatory work for RAID 6 recovery code.
---
grub-core/fs/btrfs.c | 180 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 175 insertions(+), 5 deletions(-)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 9cdbfe792..c8f034641 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -29,6 +29,7 @@
#include <minilzo.h>
#include <grub/i18n.h>
#include <grub/btrfs.h>
+#include <grub/crypto.h>
GRUB_MOD_LICENSE ("GPLv3+");
@@ -666,6 +667,157 @@ btrfs_read_from_chunk (struct grub_btrfs_data *data,
return err;
}
+struct raid56_buffer {
+ void *buf;
+ int data_is_valid;
+};
+
+static void
+rebuild_raid5 (char *dest, struct raid56_buffer *buffers,
+ grub_uint64_t nstripes, grub_uint64_t csize)
+{
+ grub_uint64_t i;
grub_uint64_t i = 0;
int first = 1;
+ int first;
+
+ i = 0;
Then you can drop this assignment.
+ while (buffers[i].data_is_valid && i < nstripes)
+ ++i;
+
+ if (i == nstripes)
+ {
+ grub_dprintf ("btrfs", "called rebuild_raid5(), but all disks are OK\n");
+ return;
+ }
+
+ grub_dprintf ("btrfs", "rebuilding RAID 5 stripe #%" PRIuGRUB_UINT64_T "\n",
+ i);
This can be in one line.
+ first = 1;
And you can drop this assignment too.
+ for (i = 0; i < nstripes; i++)
+ {
+ if (!buffers[i].data_is_valid)
+ continue;
+
+ if (first)
+ grub_memcpy(dest, buffers[i].buf, csize);
+ else
+ grub_crypto_xor (dest, dest, buffers[i].buf, csize);
I am not sure why you first use grub_memcpy() and
then move to grub_crypto_xor(). Could you explain this?
Why not use grub_crypto_xor() in all cases?
+
+ first = 0;
+ }
+}
+
+static grub_err_t
+raid56_read_retry (struct grub_btrfs_data *data,
+ struct grub_btrfs_chunk_item *chunk,
+ grub_uint64_t stripe_offset,
+ grub_uint64_t csize, void *buf)
+{
+
Please drop this empty line. I have asked about that earlier.
+ struct raid56_buffer *buffers = NULL;
+ grub_uint64_t nstripes = grub_le_to_cpu16 (chunk->nstripes);
+ grub_uint64_t chunk_type = grub_le_to_cpu64 (chunk->type);
+ grub_err_t ret = GRUB_ERR_NONE;
+ grub_uint64_t i, failed_devices;
+
+ buffers = grub_zalloc (sizeof(*buffers) * nstripes);
How often is this function called? Maybe you should consider
doing the memory allocation for this function only once and freeing
it at btrfs module unload.
+ if (!buffers)
+ {
+ ret = GRUB_ERR_OUT_OF_MEMORY;
+ goto cleanup;
+ }
+
+ for (i = 0; i < nstripes; i++)
+ {
+ buffers[i].buf = grub_zalloc (csize);
Ditto.
+ if (!buffers[i].buf)
+ {
+ ret = GRUB_ERR_OUT_OF_MEMORY;
+ goto cleanup;
+ }
+ }
+
+ for (i = 0; i < nstripes; i++)
+ {
+ struct grub_btrfs_chunk_stripe *stripe;
+ grub_disk_addr_t paddr;
+ grub_device_t dev;
+ grub_err_t err2;
+
+ stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
+ stripe += i;
+
+ paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
+ grub_dprintf ("btrfs", "reading paddr %" PRIxGRUB_UINT64_T
+ " from stripe ID %" PRIxGRUB_UINT64_T "\n", paddr,
+ stripe->device_id);
+
+ dev = find_device (data, stripe->device_id);
+ if (!dev)
+ {
+ buffers[i].data_is_valid = 0;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " FAILED (dev ID %"
+ PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+ continue;
+ }
+
+ err2 = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
+ paddr & (GRUB_DISK_SECTOR_SIZE - 1),
+ csize, buffers[i].buf);
+ if (err2 == GRUB_ERR_NONE)
+ {
+ buffers[i].data_is_valid = 1;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " Ok (dev ID %"
+ PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+ }
+ else
+ {
+ buffers[i].data_is_valid = 0;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T
+ " FAILED (dev ID %" PRIxGRUB_UINT64_T ")\n", i,
+ stripe->device_id);
+ }
+ }
+
+ failed_devices = 0;
+ for (i = 0; i < nstripes; i++)
for (failed_devices = i = 0; i < nstripes; i++)
+ if (!buffers[i].data_is_valid)
+ ++failed_devices;
+ if (failed_devices > 1 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5))
+ {
+ grub_dprintf ("btrfs",
+ "not enough disks for RAID 5: total %" PRIuGRUB_UINT64_T
+ ", missing %" PRIuGRUB_UINT64_T "\n",
+ nstripes, failed_devices);
+ ret = GRUB_ERR_READ_ERROR;
+ goto cleanup;
+ }
+ else
+ {
+ grub_dprintf ("btrfs",
+ "enough disks for RAID 5 rebuilding: total %"
+ PRIuGRUB_UINT64_T ", missing %" PRIuGRUB_UINT64_T "\n",
+ nstripes, failed_devices);
+ }
+
+ /* if these are enough, try to rebuild the data */
+ if (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+ rebuild_raid5 (buf, buffers, nstripes, csize);
+ else
+ grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
+
Space before the label please.
I have asked about this earlier.
+
Please drop this empty line.

Daniel
Goffredo Baroncelli
2018-06-14 19:05:26 UTC
Post by Daniel Kiper
Add support for recovery fo a RAID 5 btrfs profile. In addition
s/fo /for /
it is added some code as preparatory work for RAID 6 recovery code.
---
grub-core/fs/btrfs.c | 180 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 175 insertions(+), 5 deletions(-)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index 9cdbfe792..c8f034641 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -29,6 +29,7 @@
#include <minilzo.h>
#include <grub/i18n.h>
#include <grub/btrfs.h>
+#include <grub/crypto.h>
GRUB_MOD_LICENSE ("GPLv3+");
@@ -666,6 +667,157 @@ btrfs_read_from_chunk (struct grub_btrfs_data *data,
return err;
}
+struct raid56_buffer {
+ void *buf;
+ int data_is_valid;
+};
+
+static void
+rebuild_raid5 (char *dest, struct raid56_buffer *buffers,
+ grub_uint64_t nstripes, grub_uint64_t csize)
+{
+ grub_uint64_t i;
grub_uint64_t i = 0;
int first = 1;
+ int first;
+
+ i = 0;
Then you can drop this assignment.
I prefer to tie the assignment to the place where it is used.
Post by Daniel Kiper
+ while (buffers[i].data_is_valid && i < nstripes)
+ ++i;
+
+ if (i == nstripes)
+ {
+ grub_dprintf ("btrfs", "called rebuild_raid5(), but all disks are OK\n");
+ return;
+ }
+
+ grub_dprintf ("btrfs", "rebuilding RAID 5 stripe #%" PRIuGRUB_UINT64_T "\n",
+ i);
This can be in one line.
+ first = 1;
And you can drop this assignment too.
+ for (i = 0; i < nstripes; i++)
+ {
+ if (!buffers[i].data_is_valid)
+ continue;
+
+ if (first)
+ grub_memcpy(dest, buffers[i].buf, csize);
+ else
+ grub_crypto_xor (dest, dest, buffers[i].buf, csize);
I am not sure why you first use grub_memcpy() and
then move to grub_crypto_xor(). Could you explain this?
Why not use grub_crypto_xor() in all cases?
This avoids requiring that dest be initialized to zero.
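The point of the answer above can be illustrated with a small standalone sketch. It is not the GRUB code: it uses plain libc instead of grub_memcpy() and grub_crypto_xor(), and the names buf_sketch and xor_rebuild are hypothetical. The first valid buffer is copied into dest, and every later valid buffer is XORed in, so whatever dest held before the call never matters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct raid56_buffer from the patch. */
struct buf_sketch
{
  const unsigned char *buf;
  int data_is_valid;
};

/*
 * Rebuild the missing RAID 5 member as the XOR of all valid members.
 * The first valid buffer is copied, the following ones are XORed in,
 * so dest does not need to be zeroed by the caller.
 */
static void
xor_rebuild (unsigned char *dest, const struct buf_sketch *buffers,
	     size_t nstripes, size_t csize)
{
  int first = 1;
  size_t i, j;

  assert (dest != NULL && buffers != NULL);

  for (i = 0; i < nstripes; i++)
    {
      if (!buffers[i].data_is_valid)
	continue;

      if (first)
	memcpy (dest, buffers[i].buf, csize);
      else
	for (j = 0; j < csize; j++)
	  dest[j] ^= buffers[i].buf[j];

      first = 0;
    }
}
```

Using XOR for every buffer would give the same result only if dest were cleared first; the copy on the first valid buffer replaces that extra pass over dest.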
Post by Daniel Kiper
+
+ first = 0;
+ }
+}
+
+static grub_err_t
+raid56_read_retry (struct grub_btrfs_data *data,
+ struct grub_btrfs_chunk_item *chunk,
+ grub_uint64_t stripe_offset,
+ grub_uint64_t csize, void *buf)
+{
+
Please drop this empty line. I have asked about that earlier.
+ struct raid56_buffer *buffers = NULL;
+ grub_uint64_t nstripes = grub_le_to_cpu16 (chunk->nstripes);
+ grub_uint64_t chunk_type = grub_le_to_cpu64 (chunk->type);
+ grub_err_t ret = GRUB_ERR_NONE;
+ grub_uint64_t i, failed_devices;
+
+ buffers = grub_zalloc (sizeof(*buffers) * nstripes);
How often is this function called? Maybe you should consider
doing the memory allocation for this function only once and freeing
it at btrfs module unload.
This is only needed in case of recovery, which should not happen too often. Usually, this function is never called.
Post by Daniel Kiper
+ if (!buffers)
+ {
+ ret = GRUB_ERR_OUT_OF_MEMORY;
+ goto cleanup;
+ }
+
+ for (i = 0; i < nstripes; i++)
+ {
+ buffers[i].buf = grub_zalloc (csize);
Ditto.
+ if (!buffers[i].buf)
+ {
+ ret = GRUB_ERR_OUT_OF_MEMORY;
+ goto cleanup;
+ }
+ }
+
+ for (i = 0; i < nstripes; i++)
+ {
+ struct grub_btrfs_chunk_stripe *stripe;
+ grub_disk_addr_t paddr;
+ grub_device_t dev;
+ grub_err_t err2;
+
+ stripe = (struct grub_btrfs_chunk_stripe *) (chunk + 1);
+ stripe += i;
+
+ paddr = grub_le_to_cpu64 (stripe->offset) + stripe_offset;
+ grub_dprintf ("btrfs", "reading paddr %" PRIxGRUB_UINT64_T
+ " from stripe ID %" PRIxGRUB_UINT64_T "\n", paddr,
+ stripe->device_id);
+
+ dev = find_device (data, stripe->device_id);
+ if (!dev)
+ {
+ buffers[i].data_is_valid = 0;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " FAILED (dev ID %"
+ PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+ continue;
+ }
+
+ err2 = grub_disk_read (dev->disk, paddr >> GRUB_DISK_SECTOR_BITS,
+ paddr & (GRUB_DISK_SECTOR_SIZE - 1),
+ csize, buffers[i].buf);
+ if (err2 == GRUB_ERR_NONE)
+ {
+ buffers[i].data_is_valid = 1;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T " Ok (dev ID %"
+ PRIxGRUB_UINT64_T ")\n", i, stripe->device_id);
+ }
+ else
+ {
+ buffers[i].data_is_valid = 0;
+ grub_dprintf ("btrfs", "stripe %" PRIuGRUB_UINT64_T
+ " FAILED (dev ID %" PRIxGRUB_UINT64_T ")\n", i,
+ stripe->device_id);
+ }
+ }
+
+ failed_devices = 0;
+ for (i = 0; i < nstripes; i++)
for (failed_devices = i = 0; i < nstripes; i++)
Nice
Post by Daniel Kiper
+ if (!buffers[i].data_is_valid)
+ ++failed_devices;
+ if (failed_devices > 1 && (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5))
+ {
+ grub_dprintf ("btrfs",
+ "not enough disks for RAID 5: total %" PRIuGRUB_UINT64_T
+ ", missing %" PRIuGRUB_UINT64_T "\n",
+ nstripes, failed_devices);
+ ret = GRUB_ERR_READ_ERROR;
+ goto cleanup;
+ }
+ else
+ {
+ grub_dprintf ("btrfs",
+ "enough disks for RAID 5 rebuilding: total %"
+ PRIuGRUB_UINT64_T ", missing %" PRIuGRUB_UINT64_T "\n",
+ nstripes, failed_devices);
+ }
+
+ /* if these are enough, try to rebuild the data */
+ if (chunk_type & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+ rebuild_raid5 (buf, buffers, nstripes, csize);
+ else
+ grub_dprintf ("btrfs", "called rebuild_raid6(), NOT IMPLEMENTED\n");
+
Space before the label please.
I have asked about this earlier.
The line before the label is already a space; am I missing something?
Post by Daniel Kiper
+
Please drop this empty line.
Daniel
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Goffredo Baroncelli
2018-06-18 18:12:22 UTC
Post by Goffredo Baroncelli
Post by Daniel Kiper
+
Space before the label please.
I have asked about this earlier.
The line before the label is already a space; am I missing something?
I think that now I understand: are you requiring a space before the label *on the same line*?
I found this style in some grub code, but it is not used everywhere (ntfs.c and nilfs don't have the space, but xfs has it).
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Daniel Kiper
2018-09-03 12:32:44 UTC
Post by Goffredo Baroncelli
Post by Goffredo Baroncelli
Post by Daniel Kiper
+
Space before the label please.
I have asked about this earlier.
The line before the label is already a space; am I missing something?
I think that now I understand: are you requiring a space before the
label *on the same line*?
Yep!
Post by Goffredo Baroncelli
I found this style in some grub code, but it is not used everywhere
(ntfs.c and nilfs don't have the space, but xfs has it).
I know, but I prefer a label with a space before it.
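For readers following the style point, here is a minimal standalone sketch of the layout being asked for: the label is indented by one space on its own line. The function alloc_two and its body are hypothetical and use plain libc rather than GRUB allocators; only the position of the label matters.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate two buffers, release both on any path. */
static int
alloc_two (size_t n)
{
  int ret = -1;
  char *a = NULL, *b = NULL;

  assert (n > 0);

  a = malloc (n);
  if (!a)
    goto cleanup;

  b = malloc (n);
  if (!b)
    goto cleanup;

  ret = 0;

 cleanup:
  /* free (NULL) is a no-op, so one exit path covers every failure. */
  free (b);
  free (a);
  return ret;
}
```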

Daniel

Daniel Kiper
2018-06-14 13:11:28 UTC
Add the RAID 6 recovery, in order to use a RAID 6 filesystem even if some
disks (up to two) are missing. This code use the md RAID 6 code already
present in grub.
---
grub-core/fs/btrfs.c | 50 ++++++++++++++++++++++++++++++++++++++------
1 file changed, 44 insertions(+), 6 deletions(-)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index c8f034641..0b6557ce3 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -30,6 +30,7 @@
#include <grub/i18n.h>
#include <grub/btrfs.h>
#include <grub/crypto.h>
+#include <grub/diskfilter.h>
GRUB_MOD_LICENSE ("GPLv3+");
@@ -706,11 +707,35 @@ rebuild_raid5 (char *dest, struct raid56_buffer *buffers,
}
}
+static grub_err_t
+raid6_recover_read_buffer (void *data, int disk_nr,
+ grub_uint64_t addr __attribute__ ((unused)),
+ void *dest, grub_size_t size)
+{
+ struct raid56_buffer *buffers = data;
+
+ grub_memcpy(dest, buffers[disk_nr].buf, size);
+
+ GRUB_ERR_READ_ERROR;
+ return grub_errno;
if (!buffers[disk_nr].data_is_valid)
return GRUB_ERR_READ_ERROR;

grub_memcpy(dest, buffers[disk_nr].buf, size);

return GRUB_ERR_NONE;

Daniel
Daniel Kiper
2018-06-14 13:21:48 UTC
Hi Goffredo,
Hi All,
The aim of this patch set is to provide support for a BTRFS raid5/6
filesystem in GRUB.
The first patch implements the basic support for raid5/6, i.e. this works when
all the disks are present.
The next 5 patches are preparatory ones.
The 7th patch implements the raid5 recovery for btrfs (i.e. handling the
disappearance of 1 disk).
The 8th patch makes the code for handling the raid6 recovery more generic.
The last one implements the raid6 recovery for btrfs (i.e. handling the
disappearance of up to two disks).
I tested the code in grub-emu, and it works both with all the disks
and with some disks missing. I checked the crc32 calculated from grub and
from linux and these matched. Finally I checked if the support for md raid6
still works properly, and it does (with all drives and with up to 2 drives
missing).
Comments are welcome.
In general I am happy that you are doing this work. However, I have just
realized that in some cases you agree with my comments and then
do not incorporate the changes which I asked for. So, I would
be happier if, instead of saying OK, you just made the requested changes.
Otherwise you waste your time and mine. Hence, I would like to ask you to
check carefully all my comments for v4 and v5 (at least), apply all
requested changes with which you agree and then post v6.

Sorry for being blunt.

Daniel
Goffredo Baroncelli
2018-06-14 18:06:08 UTC
Post by Daniel Kiper
Hi Goffredo,
Hi All,
The aim of this patch set is to provide support for a BTRFS raid5/6
filesystem in GRUB.
The first patch implements the basic support for raid5/6, i.e. this works when
all the disks are present.
The next 5 patches are preparatory ones.
The 7th patch implements the raid5 recovery for btrfs (i.e. handling the
disappearance of 1 disk).
The 8th patch makes the code for handling the raid6 recovery more generic.
The last one implements the raid6 recovery for btrfs (i.e. handling the
disappearance of up to two disks).
I tested the code in grub-emu, and it works both with all the disks
and with some disks missing. I checked the crc32 calculated from grub and
from linux and these matched. Finally I checked if the support for md raid6
still works properly, and it does (with all drives and with up to 2 drives
missing).
Comments are welcome.
In general I am happy that you are doing this work. However, I have just
realized that in some cases you agree with my comments and then
do not incorporate the changes which I asked for. So, I would
be happier if, instead of saying OK, you just made the requested changes.
This is not good. I apologize; if this happened, it was my mistake.
When I put OK, this means that I agree with your review and will incorporate the change.
Now I am reviewing your old comments.
Post by Daniel Kiper
Otherwise you waste your time and mine. Hence, I would like to ask you to
check carefully all my comments for v4 and v5 (at least), apply all
requested changes with which you agree and then post v6.
Sorry for being blunt.
Daniel
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Goffredo Baroncelli
2018-06-14 18:55:00 UTC
Post by Daniel Kiper
---
grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
index be195448d..4d418859b 100644
--- a/grub-core/fs/btrfs.c
+++ b/grub-core/fs/btrfs.c
@@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
#define GRUB_BTRFS_CHUNK_TYPE_RAID1 0x10
#define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED 0x20
#define GRUB_BTRFS_CHUNK_TYPE_RAID10 0x40
+#define GRUB_BTRFS_CHUNK_TYPE_RAID5 0x80
+#define GRUB_BTRFS_CHUNK_TYPE_RAID6 0x100
grub_uint8_t dummy2[0xc];
grub_uint16_t nstripes;
grub_uint16_t nsubstripes;
@@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
stripe_offset = low + chunk_stripe_length
* high;
csize = chunk_stripe_length - low;
+ break;
+ }
+ {
+ grub_uint64_t nparities, stripe_nr, high, low;
+
+ redundancy = 1; /* no redundancy for now */
+
+ if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
+ {
+ grub_dprintf ("btrfs", "RAID5\n");
+ nparities = 1;
+ }
+ else
+ {
+ grub_dprintf ("btrfs", "RAID6\n");
+ nparities = 2;
+ }
+
+ /*
+ * Below is an example of a RAID 6 layout and the meaning of the
+ * variables. The same applies to RAID 5. The only differences is
+ * that there is only one parity disk instead of two.
+ *
+ * A RAID 6 layout consists of several stripes spread
+ * on the disks, following a layout like the one below
+ *
+ * Disk1 Disk2 Disk3 Ddisk4
Numbering seems confusing to me. I think that it should be
Disk0 Disk1 Disk2 Disk3
+ *
+ * A1 B1 P1 Q1
+ * Q2 A2 B2 P2
+ * P3 Q3 A3 B3
+ * [...]
+ *
+ * Note that the placement of the parities depends on row index.
+ * - stripe_nr is the stripe number not considering the parities
+ * (A1=0, B1=1, A2 = 2, B2 = 3, ...),
Please be consistent. A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...
+ * - high is the row number (0 for A1...Q1, 1 for Q2..P2, ...),
Ditto. Please always use "..." not "..".
+ * - stripen is the column number (or disk number),
AIUI starting from 0. Right? If yes then I think that
drawing above requires disks/columns renumbering.
right
Post by Daniel Kiper
+ * - off is the logical address to read (from the beginning of
+ * the chunk space),
s/chunk space/chunk/?
+ * - chunk_stripe_length is the size of a stripe (typically 64k),
+ * - nstripes is the number of disks,
+ * - low is the offset of the data inside a stripe,
+ * - stripe_offset is the offset from the beginning of the chunk
+ * disks physical address,
I am not sure that I understand. Could you clarify this?
- stripe_offset is the offset (in bytes) from the beginning of the chunk portion
stored on disk.

You can think of "stripe_offset" as the "row" in the drawing, but measured in bytes.
Post by Daniel Kiper
+ * - csize is the "potential" data to read. It will be reduced to
+ * size if the latter is smaller.
+ */
+ stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
OK.
+ /*
+ * stripen is evaluated without considering
+ * the parities (0 for A1, A2, A3... 1 for B1, B2...).
+ */
+ high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
OK.
+ /*
+ * stripen now considers also the parities (0 for A1, 1 for A2,
+ * 2 for A3....). The math is performed modulo number of disks.
+ */
+ grub_divmod64 (high + stripen, nstripes, &stripen);
OK.
+ stripe_offset = low + chunk_stripe_length * high;
Hmmm... I am confused. What does it mean?
Daniel
_______________________________________________
Grub-devel mailing list
https://lists.gnu.org/mailman/listinfo/grub-devel
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Goffredo Baroncelli
2018-06-19 17:30:52 UTC
Post by Goffredo Baroncelli
Post by Daniel Kiper
+ * - stripe_offset is the offset from the beginning of the chunk
+ * disks physical address,
I am not sure that I understand. Could you clarify this?
- stripe_offset is the offset (in bytes) from the beginning of the chunk portion
stored on disk.
You can think of "stripe_offset" as the "row" in the drawing, but measured in bytes.
After some thought, I will update this description to:

- * - stripe_offset is the offset from the beginning of the chunk
- * disks physical address,
+ * - stripe_offset is the disk offset from the beginning
+ * of the disk chunk mapping start,
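The variables described in the comment can be tried out with a small standalone sketch. It mirrors the three grub_divmod64() calls from the patch using plain C division and modulo. The struct raid56_map and the function raid56_map_offset are hypothetical names, and csize is computed as in the preceding branch of the diff context (chunk_stripe_length - low), which is an assumption for the RAID 5/6 branch since that part of the patch is not quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of GRUB. */
struct raid56_map
{
  uint64_t stripen;		/* column (disk) index, starting from 0 */
  uint64_t stripe_offset;	/* byte offset inside the disk's chunk mapping */
  uint64_t csize;		/* bytes available up to the end of the stripe */
};

static struct raid56_map
raid56_map_offset (uint64_t off, uint64_t chunk_stripe_length,
		   uint64_t nstripes, uint64_t nparities)
{
  struct raid56_map m;
  uint64_t stripe_nr, high, low;

  assert (chunk_stripe_length > 0 && nstripes > nparities);

  /* Data stripe number (parities not counted) and offset inside it. */
  stripe_nr = off / chunk_stripe_length;
  low = off % chunk_stripe_length;

  /* Row number, and column counted without the parities. */
  high = stripe_nr / (nstripes - nparities);
  m.stripen = stripe_nr % (nstripes - nparities);

  /* Shift the column by the row number, modulo the number of disks. */
  m.stripen = (high + m.stripen) % nstripes;

  /* The row expressed in bytes, plus the offset inside the stripe. */
  m.stripe_offset = low + chunk_stripe_length * high;
  m.csize = chunk_stripe_length - low;

  return m;
}
```

With four disks, two parities and 64 KiB stripes, and columns numbered from 0 as suggested in the review, the sketch places A1 in column 0, B1 in column 1, A2 in column 1, B2 in column 2 and A3 in column 2, which matches the drawing in the comment.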
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5