Skip to content

Commit 8296091

Browse files
fdmananakdave
authored andcommitted
btrfs: fix corrupt read due to bad offset of a compressed extent map
If we attempt to insert a compressed extent map that has a range that overlaps another extent map we have in the inode's extent map tree, we can end up with an incorrect offset after adjusting the new extent map at merge_extent_mapping() because we don't update the extent map's offset. For example consider the following scenario: 1) We have a file extent item for a compressed extent covering the file range [108K, 144K) and currently there's no corresponding extent map in the inode's extent map tree; 2) The inode's size is 141K; 3) We have an encoded write (compressed) into the file range [120K, 128K), which overlaps the existing file extent item. The encoded write creates a matching extent map, adds it to the inode's extent map tree and creates an ordered extent for it. Note that the corresponding file extent item is added to the subvolume tree only when the ordered extent completes (when executing btrfs_finish_one_ordered()); 4) We have a write into the file range [160K, 164K). This writes increases the i_size of the file, and there's a hole between the current i_size (141K) and the start offset of this write, and since the old i_size is in the middle of the block [140K, 144K), we have to write zeroes to the range [141K, 144K) (3072 bytes) and therefore dirty that page. We then call btrfs_set_extent_delalloc() with a start offset of 140K. We then end up at btrfs_find_new_delalloc_bytes() which will call btrfs_get_extent() for the range [140K, 144K); 5) The btrfs_get_extent() doesn't find any extent map in the inode's extent map tree covering the range [140K, 144K), so it searches the subvolume tree for any file extent items covering that range. There it finds the file extent item for the range [108K, 144K), creates a compressed extent map for that range and then calls btrfs_add_extent_mapping() with that extent map and passes the range [140K, 144K) via the "start" and "len" parameters; 6) The call to add_extent_mapping() done by btrfs_add_extent_mapping() fails with -EEXIST because there's an extent map, created at step 2 for the [120K, 128K) range, that covers that overlaps with the range of the given extent map ([108K, 144K)). Then it does a lookup for extent map from step 2 add calls merge_extent_mapping() to adjust the input extent map ([108K, 144K)). That adjust the extent map to a start offset of 128K and a length of 16K (starting just after the extent map from step 2), but it does not update the offset field of the extent map, leaving it with a value of zero instead of updating to a value of 20K (128K - 108K = 20K). As a result any read for the range [128K, 144K) can return incorrect data since we read from a wrong section of the extent (unless both the correct and incorrect ranges happen to have the same data). So fix this by changing merge_extent_mapping() to update the extent map's offset even if it's compressed. Also add a test case to the self tests. This didn't happen before the patchset that does big changes in the extent map structure (which includes the commit in the Fixes tag below) because we kept track of the original start offset in the extent map (member "orig_start") so we could always calculate the correct offset by subtracting that offset from the start offset. A test case for fstests that triggered this problem using send/receive with compressed writes will be added soon. Fixes: 3d2ac99 ("btrfs: introduce new members for extent_map") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
1 parent 757dd69 commit 8296091

File tree

2 files changed

+100
-1
lines changed

2 files changed

+100
-1
lines changed

Diff for: fs/btrfs/extent_map.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -664,7 +664,7 @@ static noinline int merge_extent_mapping(struct btrfs_inode *inode,
664664
start_diff = start - em->start;
665665
em->start = start;
666666
em->len = end - start;
667-
if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE && !extent_map_is_compressed(em))
667+
if (em->disk_bytenr < EXTENT_MAP_LAST_BYTE)
668668
em->offset += start_diff;
669669
return add_extent_mapping(inode, em, 0);
670670
}

Diff for: fs/btrfs/tests/extent-map-tests.c

+99
Original file line numberDiff line numberDiff line change
@@ -900,6 +900,102 @@ static int test_case_7(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode)
900900
return ret;
901901
}
902902

903+
/*
904+
* Test a regression for compressed extent map adjustment when we attempt to
905+
* add an extent map that is partially overlapped by another existing extent
906+
* map. The resulting extent map offset was left unchanged despite having
907+
* incremented its start offset.
908+
*/
909+
static int test_case_8(struct btrfs_fs_info *fs_info, struct btrfs_inode *inode)
910+
{
911+
struct extent_map_tree *em_tree = &inode->extent_tree;
912+
struct extent_map *em;
913+
int ret;
914+
int ret2;
915+
916+
em = alloc_extent_map();
917+
if (!em) {
918+
test_std_err(TEST_ALLOC_EXTENT_MAP);
919+
return -ENOMEM;
920+
}
921+
922+
/* Compressed extent for the file range [120K, 128K). */
923+
em->start = SZ_1K * 120;
924+
em->len = SZ_8K;
925+
em->disk_num_bytes = SZ_4K;
926+
em->ram_bytes = SZ_8K;
927+
em->flags |= EXTENT_FLAG_COMPRESS_ZLIB;
928+
write_lock(&em_tree->lock);
929+
ret = btrfs_add_extent_mapping(inode, &em, em->start, em->len);
930+
write_unlock(&em_tree->lock);
931+
free_extent_map(em);
932+
if (ret < 0) {
933+
test_err("couldn't add extent map for range [120K, 128K)");
934+
goto out;
935+
}
936+
937+
em = alloc_extent_map();
938+
if (!em) {
939+
test_std_err(TEST_ALLOC_EXTENT_MAP);
940+
ret = -ENOMEM;
941+
goto out;
942+
}
943+
944+
/*
945+
* Compressed extent for the file range [108K, 144K), which overlaps
946+
* with the [120K, 128K) we previously inserted.
947+
*/
948+
em->start = SZ_1K * 108;
949+
em->len = SZ_1K * 36;
950+
em->disk_num_bytes = SZ_4K;
951+
em->ram_bytes = SZ_1K * 36;
952+
em->flags |= EXTENT_FLAG_COMPRESS_ZLIB;
953+
954+
/*
955+
* Try to add the extent map but with a search range of [140K, 144K),
956+
* this should succeed and adjust the extent map to the range
957+
* [128K, 144K), with a length of 16K and an offset of 20K.
958+
*
959+
* This simulates a scenario where in the subvolume tree of an inode we
960+
* have a compressed file extent item for the range [108K, 144K) and we
961+
* have an overlapping compressed extent map for the range [120K, 128K),
962+
* which was created by an encoded write, but its ordered extent was not
963+
* yet completed, so the subvolume tree doesn't have yet the file extent
964+
* item for that range - we only have the extent map in the inode's
965+
* extent map tree.
966+
*/
967+
write_lock(&em_tree->lock);
968+
ret = btrfs_add_extent_mapping(inode, &em, SZ_1K * 140, SZ_4K);
969+
write_unlock(&em_tree->lock);
970+
free_extent_map(em);
971+
if (ret < 0) {
972+
test_err("couldn't add extent map for range [108K, 144K)");
973+
goto out;
974+
}
975+
976+
if (em->start != SZ_128K) {
977+
test_err("unexpected extent map start %llu (should be 128K)", em->start);
978+
ret = -EINVAL;
979+
goto out;
980+
}
981+
if (em->len != SZ_16K) {
982+
test_err("unexpected extent map length %llu (should be 16K)", em->len);
983+
ret = -EINVAL;
984+
goto out;
985+
}
986+
if (em->offset != SZ_1K * 20) {
987+
test_err("unexpected extent map offset %llu (should be 20K)", em->offset);
988+
ret = -EINVAL;
989+
goto out;
990+
}
991+
out:
992+
ret2 = free_extent_map_tree(inode);
993+
if (ret == 0)
994+
ret = ret2;
995+
996+
return ret;
997+
}
998+
903999
struct rmap_test_vector {
9041000
u64 raid_type;
9051001
u64 physical_start;
@@ -1076,6 +1172,9 @@ int btrfs_test_extent_map(void)
10761172
if (ret)
10771173
goto out;
10781174
ret = test_case_7(fs_info, BTRFS_I(inode));
1175+
if (ret)
1176+
goto out;
1177+
ret = test_case_8(fs_info, BTRFS_I(inode));
10791178
if (ret)
10801179
goto out;
10811180

0 commit comments

Comments
 (0)