Certain "emoji" are still half-sized #900

Closed
shanselman opened this issue May 19, 2019 · 19 comments · Fixed by #5795
Assignees
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.
Milestone

Comments

@shanselman
Member

Environment

Windows build number: 10.0.18899.1000
Windows Terminal version: 0.1.1361.0

Steps to reproduce

Set up a standard Powerline prompt using Fira Code in WSL(2) and navigate to a git repository that contains edits. Note that if you run
printf '✏', the pencil is double the size of the one in the Powerline prompt.

image

Expected behavior

I expect emojis in the prompt to appear at the same size as those later in the same line.

Actual behavior

Emoji size appears to be halved; see the screenshot above.

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels May 19, 2019
@DHowett-MSFT
Contributor

Fascinating! I think we're doing this technically right, but it doesn't turn out looking right.

According to Unicode, \u270E (LOWER RIGHT PENCIL) is not supposed to be an Emoji. It's not in any of the tables . . . so the real issue here is that we here at Microsoft gave it a glyph and made it one.

image

@DHowett-MSFT DHowett-MSFT added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Terminal The new Windows Terminal. and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels May 19, 2019
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label May 19, 2019
@DHowett-MSFT DHowett-MSFT changed the title Bug Report - Emoji size is halved when Emoji appears in PowerLine Prompt Certain "emoji" are still half-sized May 19, 2019
@fcharlie
Contributor

fcharlie commented May 20, 2019

🛠 is affected as well.

Annotation 2019-05-20 202131

@zadjii-msft
Member

Seems kinda similar to #455 and maybe #714

@miniksa miniksa self-assigned this May 21, 2019
@zadjii-msft zadjii-msft added this to the Terminal v1.0 milestone Jun 19, 2019
@terrywh

terrywh commented Jun 28, 2019

Maybe 🕷 (spider) as well? (Store version v0.2.1715.0)

image

@fcharlie
Contributor

This issue may be related to https://github.com/microsoft/terminal/blob/master/src/types/CodepointWidthDetector.cpp. The width it reports for the emoji 🛠 (\U0001F6E0) is actually 1.

In the https://github.com/fumiyas/wcwidth-cjk project, with emoji detection enabled, the width reported is 2.

@fcharlie
Contributor

// Extracted from
// https://github.com/microsoft/terminal/blob/734fc1dcc6de4315d4cc91944c5ea83b7b8a7e1a/src/types/CodepointWidthDetector.cpp
#include <algorithm> // std::lower_bound
#include <array>
#include <cstddef>   // size_t
#include <cstdint>   // uint8_t
#include <iterator>
namespace unicode {
enum class CodepointWidth : uint8_t {
  Narrow,
  Wide,
  Ambiguous, // could be narrow or wide depending on the current codepage and
             // font
  Invalid    // not a valid unicode codepoint
};

// used to store range data in CodepointWidthDetector's internal map
struct UnicodeRange final {
  unsigned int lowerBound;
  unsigned int upperBound;
  CodepointWidth width;
};

static bool operator<(const UnicodeRange &range,
                      const unsigned int searchTerm) {
  return range.upperBound < searchTerm;
}

static constexpr std::array<UnicodeRange, 285> s_wideAndAmbiguousTable{
    // generated from
    // http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
    // anything not present here is presumed to be Narrow.
    UnicodeRange{0xa1, 0xa1, CodepointWidth::Ambiguous},
    UnicodeRange{0xa4, 0xa4, CodepointWidth::Ambiguous},
    UnicodeRange{0xa7, 0xa8, CodepointWidth::Ambiguous},
    UnicodeRange{0xaa, 0xaa, CodepointWidth::Ambiguous},
    UnicodeRange{0xad, 0xae, CodepointWidth::Ambiguous},
    UnicodeRange{0xb0, 0xb4, CodepointWidth::Ambiguous},
    UnicodeRange{0xb6, 0xba, CodepointWidth::Ambiguous},
    UnicodeRange{0xbc, 0xbf, CodepointWidth::Ambiguous},
    UnicodeRange{0xc6, 0xc6, CodepointWidth::Ambiguous},
    UnicodeRange{0xd0, 0xd0, CodepointWidth::Ambiguous},
    UnicodeRange{0xd7, 0xd8, CodepointWidth::Ambiguous},
    UnicodeRange{0xde, 0xe1, CodepointWidth::Ambiguous},
    UnicodeRange{0xe6, 0xe6, CodepointWidth::Ambiguous},
    UnicodeRange{0xe8, 0xea, CodepointWidth::Ambiguous},
    UnicodeRange{0xec, 0xed, CodepointWidth::Ambiguous},
    UnicodeRange{0xf0, 0xf0, CodepointWidth::Ambiguous},
    UnicodeRange{0xf2, 0xf3, CodepointWidth::Ambiguous},
    UnicodeRange{0xf7, 0xfa, CodepointWidth::Ambiguous},
    UnicodeRange{0xfc, 0xfc, CodepointWidth::Ambiguous},
    UnicodeRange{0xfe, 0xfe, CodepointWidth::Ambiguous},
    UnicodeRange{0x101, 0x101, CodepointWidth::Ambiguous},
    UnicodeRange{0x111, 0x111, CodepointWidth::Ambiguous},
    UnicodeRange{0x113, 0x113, CodepointWidth::Ambiguous},
    UnicodeRange{0x11b, 0x11b, CodepointWidth::Ambiguous},
    UnicodeRange{0x126, 0x127, CodepointWidth::Ambiguous},
    UnicodeRange{0x12b, 0x12b, CodepointWidth::Ambiguous},
    UnicodeRange{0x131, 0x133, CodepointWidth::Ambiguous},
    UnicodeRange{0x138, 0x138, CodepointWidth::Ambiguous},
    UnicodeRange{0x13f, 0x142, CodepointWidth::Ambiguous},
    UnicodeRange{0x144, 0x144, CodepointWidth::Ambiguous},
    UnicodeRange{0x148, 0x14b, CodepointWidth::Ambiguous},
    UnicodeRange{0x14d, 0x14d, CodepointWidth::Ambiguous},
    UnicodeRange{0x152, 0x153, CodepointWidth::Ambiguous},
    UnicodeRange{0x166, 0x167, CodepointWidth::Ambiguous},
    UnicodeRange{0x16b, 0x16b, CodepointWidth::Ambiguous},
    UnicodeRange{0x1ce, 0x1ce, CodepointWidth::Ambiguous},
    UnicodeRange{0x1d0, 0x1d0, CodepointWidth::Ambiguous},
    UnicodeRange{0x1d2, 0x1d2, CodepointWidth::Ambiguous},
    UnicodeRange{0x1d4, 0x1d4, CodepointWidth::Ambiguous},
    UnicodeRange{0x1d6, 0x1d6, CodepointWidth::Ambiguous},
    UnicodeRange{0x1d8, 0x1d8, CodepointWidth::Ambiguous},
    UnicodeRange{0x1da, 0x1da, CodepointWidth::Ambiguous},
    UnicodeRange{0x1dc, 0x1dc, CodepointWidth::Ambiguous},
    UnicodeRange{0x251, 0x251, CodepointWidth::Ambiguous},
    UnicodeRange{0x261, 0x261, CodepointWidth::Ambiguous},
    UnicodeRange{0x2c4, 0x2c4, CodepointWidth::Ambiguous},
    UnicodeRange{0x2c7, 0x2c7, CodepointWidth::Ambiguous},
    UnicodeRange{0x2c9, 0x2cb, CodepointWidth::Ambiguous},
    UnicodeRange{0x2cd, 0x2cd, CodepointWidth::Ambiguous},
    UnicodeRange{0x2d0, 0x2d0, CodepointWidth::Ambiguous},
    UnicodeRange{0x2d8, 0x2db, CodepointWidth::Ambiguous},
    UnicodeRange{0x2dd, 0x2dd, CodepointWidth::Ambiguous},
    UnicodeRange{0x2df, 0x2df, CodepointWidth::Ambiguous},
    UnicodeRange{0x300, 0x36f, CodepointWidth::Ambiguous},
    UnicodeRange{0x391, 0x3a1, CodepointWidth::Ambiguous},
    UnicodeRange{0x3a3, 0x3a9, CodepointWidth::Ambiguous},
    UnicodeRange{0x3b1, 0x3c1, CodepointWidth::Ambiguous},
    UnicodeRange{0x3c3, 0x3c9, CodepointWidth::Ambiguous},
    UnicodeRange{0x401, 0x401, CodepointWidth::Ambiguous},
    UnicodeRange{0x410, 0x44f, CodepointWidth::Ambiguous},
    UnicodeRange{0x451, 0x451, CodepointWidth::Ambiguous},
    UnicodeRange{0x1100, 0x115f, CodepointWidth::Wide},
    UnicodeRange{0x2010, 0x2010, CodepointWidth::Ambiguous},
    UnicodeRange{0x2013, 0x2016, CodepointWidth::Ambiguous},
    UnicodeRange{0x2018, 0x2019, CodepointWidth::Ambiguous},
    UnicodeRange{0x201c, 0x201d, CodepointWidth::Ambiguous},
    UnicodeRange{0x2020, 0x2022, CodepointWidth::Ambiguous},
    UnicodeRange{0x2024, 0x2027, CodepointWidth::Ambiguous},
    UnicodeRange{0x2030, 0x2030, CodepointWidth::Ambiguous},
    UnicodeRange{0x2032, 0x2033, CodepointWidth::Ambiguous},
    UnicodeRange{0x2035, 0x2035, CodepointWidth::Ambiguous},
    UnicodeRange{0x203b, 0x203b, CodepointWidth::Ambiguous},
    UnicodeRange{0x203e, 0x203e, CodepointWidth::Ambiguous},
    UnicodeRange{0x2074, 0x2074, CodepointWidth::Ambiguous},
    UnicodeRange{0x207f, 0x207f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2081, 0x2084, CodepointWidth::Ambiguous},
    UnicodeRange{0x20ac, 0x20ac, CodepointWidth::Ambiguous},
    UnicodeRange{0x2103, 0x2103, CodepointWidth::Ambiguous},
    UnicodeRange{0x2105, 0x2105, CodepointWidth::Ambiguous},
    UnicodeRange{0x2109, 0x2109, CodepointWidth::Ambiguous},
    UnicodeRange{0x2113, 0x2113, CodepointWidth::Ambiguous},
    UnicodeRange{0x2116, 0x2116, CodepointWidth::Ambiguous},
    UnicodeRange{0x2121, 0x2122, CodepointWidth::Ambiguous},
    UnicodeRange{0x2126, 0x2126, CodepointWidth::Ambiguous},
    UnicodeRange{0x212b, 0x212b, CodepointWidth::Ambiguous},
    UnicodeRange{0x2153, 0x2154, CodepointWidth::Ambiguous},
    UnicodeRange{0x215b, 0x215e, CodepointWidth::Ambiguous},
    UnicodeRange{0x2160, 0x216b, CodepointWidth::Ambiguous},
    UnicodeRange{0x2170, 0x2179, CodepointWidth::Ambiguous},
    UnicodeRange{0x2189, 0x2189, CodepointWidth::Ambiguous},
    UnicodeRange{0x2190, 0x2199, CodepointWidth::Ambiguous},
    UnicodeRange{0x21b8, 0x21b9, CodepointWidth::Ambiguous},
    UnicodeRange{0x21d2, 0x21d2, CodepointWidth::Ambiguous},
    UnicodeRange{0x21d4, 0x21d4, CodepointWidth::Ambiguous},
    UnicodeRange{0x21e7, 0x21e7, CodepointWidth::Ambiguous},
    UnicodeRange{0x2200, 0x2200, CodepointWidth::Ambiguous},
    UnicodeRange{0x2202, 0x2203, CodepointWidth::Ambiguous},
    UnicodeRange{0x2207, 0x2208, CodepointWidth::Ambiguous},
    UnicodeRange{0x220b, 0x220b, CodepointWidth::Ambiguous},
    UnicodeRange{0x220f, 0x220f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2211, 0x2211, CodepointWidth::Ambiguous},
    UnicodeRange{0x2215, 0x2215, CodepointWidth::Ambiguous},
    UnicodeRange{0x221a, 0x221a, CodepointWidth::Ambiguous},
    UnicodeRange{0x221d, 0x2220, CodepointWidth::Ambiguous},
    UnicodeRange{0x2223, 0x2223, CodepointWidth::Ambiguous},
    UnicodeRange{0x2225, 0x2225, CodepointWidth::Ambiguous},
    UnicodeRange{0x2227, 0x222c, CodepointWidth::Ambiguous},
    UnicodeRange{0x222e, 0x222e, CodepointWidth::Ambiguous},
    UnicodeRange{0x2234, 0x2237, CodepointWidth::Ambiguous},
    UnicodeRange{0x223c, 0x223d, CodepointWidth::Ambiguous},
    UnicodeRange{0x2248, 0x2248, CodepointWidth::Ambiguous},
    UnicodeRange{0x224c, 0x224c, CodepointWidth::Ambiguous},
    UnicodeRange{0x2252, 0x2252, CodepointWidth::Ambiguous},
    UnicodeRange{0x2260, 0x2261, CodepointWidth::Ambiguous},
    UnicodeRange{0x2264, 0x2267, CodepointWidth::Ambiguous},
    UnicodeRange{0x226a, 0x226b, CodepointWidth::Ambiguous},
    UnicodeRange{0x226e, 0x226f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2282, 0x2283, CodepointWidth::Ambiguous},
    UnicodeRange{0x2286, 0x2287, CodepointWidth::Ambiguous},
    UnicodeRange{0x2295, 0x2295, CodepointWidth::Ambiguous},
    UnicodeRange{0x2299, 0x2299, CodepointWidth::Ambiguous},
    UnicodeRange{0x22a5, 0x22a5, CodepointWidth::Ambiguous},
    UnicodeRange{0x22bf, 0x22bf, CodepointWidth::Ambiguous},
    UnicodeRange{0x2312, 0x2312, CodepointWidth::Ambiguous},
    UnicodeRange{0x231a, 0x231b, CodepointWidth::Wide},
    UnicodeRange{0x2329, 0x232a, CodepointWidth::Wide},
    UnicodeRange{0x23e9, 0x23ec, CodepointWidth::Wide},
    UnicodeRange{0x23f0, 0x23f0, CodepointWidth::Wide},
    UnicodeRange{0x23f3, 0x23f3, CodepointWidth::Wide},
    UnicodeRange{0x2460, 0x24e9, CodepointWidth::Ambiguous},
    UnicodeRange{0x24eb, 0x254b, CodepointWidth::Ambiguous},
    UnicodeRange{0x2550, 0x2573, CodepointWidth::Ambiguous},
    UnicodeRange{0x2580, 0x258f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2592, 0x2595, CodepointWidth::Ambiguous},
    UnicodeRange{0x25a0, 0x25a1, CodepointWidth::Ambiguous},
    UnicodeRange{0x25a3, 0x25a9, CodepointWidth::Ambiguous},
    UnicodeRange{0x25b2, 0x25b3, CodepointWidth::Ambiguous},
    UnicodeRange{0x25b6, 0x25b7, CodepointWidth::Ambiguous},
    UnicodeRange{0x25bc, 0x25bd, CodepointWidth::Ambiguous},
    UnicodeRange{0x25c0, 0x25c1, CodepointWidth::Ambiguous},
    UnicodeRange{0x25c6, 0x25c8, CodepointWidth::Ambiguous},
    UnicodeRange{0x25cb, 0x25cb, CodepointWidth::Ambiguous},
    UnicodeRange{0x25ce, 0x25d1, CodepointWidth::Ambiguous},
    UnicodeRange{0x25e2, 0x25e5, CodepointWidth::Ambiguous},
    UnicodeRange{0x25ef, 0x25ef, CodepointWidth::Ambiguous},
    UnicodeRange{0x25fd, 0x25fe, CodepointWidth::Wide},
    UnicodeRange{0x2605, 0x2606, CodepointWidth::Ambiguous},
    UnicodeRange{0x2609, 0x2609, CodepointWidth::Ambiguous},
    UnicodeRange{0x260e, 0x260f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2614, 0x2615, CodepointWidth::Wide},
    UnicodeRange{0x261c, 0x261c, CodepointWidth::Ambiguous},
    UnicodeRange{0x261e, 0x261e, CodepointWidth::Ambiguous},
    UnicodeRange{0x2640, 0x2640, CodepointWidth::Ambiguous},
    UnicodeRange{0x2642, 0x2642, CodepointWidth::Ambiguous},
    UnicodeRange{0x2648, 0x2653, CodepointWidth::Wide},
    UnicodeRange{0x2660, 0x2661, CodepointWidth::Ambiguous},
    UnicodeRange{0x2663, 0x2665, CodepointWidth::Ambiguous},
    UnicodeRange{0x2667, 0x266a, CodepointWidth::Ambiguous},
    UnicodeRange{0x266c, 0x266d, CodepointWidth::Ambiguous},
    UnicodeRange{0x266f, 0x266f, CodepointWidth::Ambiguous},
    UnicodeRange{0x267f, 0x267f, CodepointWidth::Wide},
    UnicodeRange{0x2693, 0x2693, CodepointWidth::Wide},
    UnicodeRange{0x269e, 0x269f, CodepointWidth::Ambiguous},
    UnicodeRange{0x26a1, 0x26a1, CodepointWidth::Wide},
    UnicodeRange{0x26aa, 0x26ab, CodepointWidth::Wide},
    UnicodeRange{0x26bd, 0x26be, CodepointWidth::Wide},
    UnicodeRange{0x26bf, 0x26bf, CodepointWidth::Ambiguous},
    UnicodeRange{0x26c4, 0x26c5, CodepointWidth::Wide},
    UnicodeRange{0x26c6, 0x26cd, CodepointWidth::Ambiguous},
    UnicodeRange{0x26ce, 0x26ce, CodepointWidth::Wide},
    UnicodeRange{0x26cf, 0x26d3, CodepointWidth::Ambiguous},
    UnicodeRange{0x26d4, 0x26d4, CodepointWidth::Wide},
    UnicodeRange{0x26d5, 0x26e1, CodepointWidth::Ambiguous},
    UnicodeRange{0x26e3, 0x26e3, CodepointWidth::Ambiguous},
    UnicodeRange{0x26e8, 0x26e9, CodepointWidth::Ambiguous},
    UnicodeRange{0x26ea, 0x26ea, CodepointWidth::Wide},
    UnicodeRange{0x26eb, 0x26f1, CodepointWidth::Ambiguous},
    UnicodeRange{0x26f2, 0x26f3, CodepointWidth::Wide},
    UnicodeRange{0x26f4, 0x26f4, CodepointWidth::Ambiguous},
    UnicodeRange{0x26f5, 0x26f5, CodepointWidth::Wide},
    UnicodeRange{0x26f6, 0x26f9, CodepointWidth::Ambiguous},
    UnicodeRange{0x26fa, 0x26fa, CodepointWidth::Wide},
    UnicodeRange{0x26fb, 0x26fc, CodepointWidth::Ambiguous},
    UnicodeRange{0x26fd, 0x26fd, CodepointWidth::Wide},
    UnicodeRange{0x26fe, 0x26ff, CodepointWidth::Ambiguous},
    UnicodeRange{0x2705, 0x2705, CodepointWidth::Wide},
    UnicodeRange{0x270a, 0x270b, CodepointWidth::Wide},
    UnicodeRange{0x2728, 0x2728, CodepointWidth::Wide},
    UnicodeRange{0x273d, 0x273d, CodepointWidth::Ambiguous},
    UnicodeRange{0x274c, 0x274c, CodepointWidth::Wide},
    UnicodeRange{0x274e, 0x274e, CodepointWidth::Wide},
    UnicodeRange{0x2753, 0x2755, CodepointWidth::Wide},
    UnicodeRange{0x2757, 0x2757, CodepointWidth::Wide},
    UnicodeRange{0x2776, 0x277f, CodepointWidth::Ambiguous},
    UnicodeRange{0x2795, 0x2797, CodepointWidth::Wide},
    UnicodeRange{0x27b0, 0x27b0, CodepointWidth::Wide},
    UnicodeRange{0x27bf, 0x27bf, CodepointWidth::Wide},
    UnicodeRange{0x2b1b, 0x2b1c, CodepointWidth::Wide},
    UnicodeRange{0x2b50, 0x2b50, CodepointWidth::Wide},
    UnicodeRange{0x2b55, 0x2b55, CodepointWidth::Wide},
    UnicodeRange{0x2b56, 0x2b59, CodepointWidth::Ambiguous},
    UnicodeRange{0x2e80, 0x2e99, CodepointWidth::Wide},
    UnicodeRange{0x2e9b, 0x2ef3, CodepointWidth::Wide},
    UnicodeRange{0x2f00, 0x2fd5, CodepointWidth::Wide},
    UnicodeRange{0x2ff0, 0x2ffb, CodepointWidth::Wide},
    UnicodeRange{0x3000, 0x303e, CodepointWidth::Wide},
    UnicodeRange{0x3041, 0x3096, CodepointWidth::Wide},
    UnicodeRange{0x3099, 0x30ff, CodepointWidth::Wide},
    UnicodeRange{0x3105, 0x312e, CodepointWidth::Wide},
    UnicodeRange{0x3131, 0x318e, CodepointWidth::Wide},
    UnicodeRange{0x3190, 0x31ba, CodepointWidth::Wide},
    UnicodeRange{0x31c0, 0x31e3, CodepointWidth::Wide},
    UnicodeRange{0x31f0, 0x321e, CodepointWidth::Wide},
    UnicodeRange{0x3220, 0x3247, CodepointWidth::Wide},
    UnicodeRange{0x3248, 0x324f, CodepointWidth::Ambiguous},
    UnicodeRange{0x3250, 0x32fe, CodepointWidth::Wide},
    UnicodeRange{0x3300, 0x4dbf, CodepointWidth::Wide},
    UnicodeRange{0x4e00, 0xa48c, CodepointWidth::Wide},
    UnicodeRange{0xa490, 0xa4c6, CodepointWidth::Wide},
    UnicodeRange{0xa960, 0xa97c, CodepointWidth::Wide},
    UnicodeRange{0xac00, 0xd7a3, CodepointWidth::Wide},
    UnicodeRange{0xe000, 0xf8ff, CodepointWidth::Ambiguous},
    UnicodeRange{0xf900, 0xfaff, CodepointWidth::Wide},
    UnicodeRange{0xfe00, 0xfe0f, CodepointWidth::Ambiguous},
    UnicodeRange{0xfe10, 0xfe19, CodepointWidth::Wide},
    UnicodeRange{0xfe30, 0xfe52, CodepointWidth::Wide},
    UnicodeRange{0xfe54, 0xfe66, CodepointWidth::Wide},
    UnicodeRange{0xfe68, 0xfe6b, CodepointWidth::Wide},
    UnicodeRange{0xff01, 0xff60, CodepointWidth::Wide},
    UnicodeRange{0xffe0, 0xffe6, CodepointWidth::Wide},
    UnicodeRange{0xfffd, 0xfffd, CodepointWidth::Ambiguous},
    UnicodeRange{0x16fe0, 0x16fe1, CodepointWidth::Wide},
    UnicodeRange{0x17000, 0x187ec, CodepointWidth::Wide},
    UnicodeRange{0x18800, 0x18af2, CodepointWidth::Wide},
    UnicodeRange{0x1b000, 0x1b11e, CodepointWidth::Wide},
    UnicodeRange{0x1b170, 0x1b2fb, CodepointWidth::Wide},
    UnicodeRange{0x1f004, 0x1f004, CodepointWidth::Wide},
    UnicodeRange{0x1f0cf, 0x1f0cf, CodepointWidth::Wide},
    UnicodeRange{0x1f100, 0x1f10a, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f110, 0x1f12d, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f130, 0x1f169, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f170, 0x1f18d, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f18e, 0x1f18e, CodepointWidth::Wide},
    UnicodeRange{0x1f18f, 0x1f190, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f191, 0x1f19a, CodepointWidth::Wide},
    UnicodeRange{0x1f19b, 0x1f1ac, CodepointWidth::Ambiguous},
    UnicodeRange{0x1f200, 0x1f202, CodepointWidth::Wide},
    UnicodeRange{0x1f210, 0x1f23b, CodepointWidth::Wide},
    UnicodeRange{0x1f240, 0x1f248, CodepointWidth::Wide},
    UnicodeRange{0x1f250, 0x1f251, CodepointWidth::Wide},
    UnicodeRange{0x1f260, 0x1f265, CodepointWidth::Wide},
    UnicodeRange{0x1f300, 0x1f320, CodepointWidth::Wide},
    UnicodeRange{0x1f32d, 0x1f335, CodepointWidth::Wide},
    UnicodeRange{0x1f337, 0x1f37c, CodepointWidth::Wide},
    UnicodeRange{0x1f37e, 0x1f393, CodepointWidth::Wide},
    UnicodeRange{0x1f3a0, 0x1f3ca, CodepointWidth::Wide},
    UnicodeRange{0x1f3cf, 0x1f3d3, CodepointWidth::Wide},
    UnicodeRange{0x1f3e0, 0x1f3f0, CodepointWidth::Wide},
    UnicodeRange{0x1f3f4, 0x1f3f4, CodepointWidth::Wide},
    UnicodeRange{0x1f3f8, 0x1f43e, CodepointWidth::Wide},
    UnicodeRange{0x1f440, 0x1f440, CodepointWidth::Wide},
    UnicodeRange{0x1f442, 0x1f4fc, CodepointWidth::Wide},
    UnicodeRange{0x1f4ff, 0x1f53d, CodepointWidth::Wide},
    UnicodeRange{0x1f54b, 0x1f54e, CodepointWidth::Wide},
    UnicodeRange{0x1f550, 0x1f567, CodepointWidth::Wide},
    UnicodeRange{0x1f57a, 0x1f57a, CodepointWidth::Wide},
    UnicodeRange{0x1f595, 0x1f596, CodepointWidth::Wide},
    UnicodeRange{0x1f5a4, 0x1f5a4, CodepointWidth::Wide},
    UnicodeRange{0x1f5fb, 0x1f64f, CodepointWidth::Wide},
    UnicodeRange{0x1f680, 0x1f6c5, CodepointWidth::Wide},
    UnicodeRange{0x1f6cc, 0x1f6cc, CodepointWidth::Wide},
    UnicodeRange{0x1f6d0, 0x1f6d2, CodepointWidth::Wide},
    UnicodeRange{0x1f6eb, 0x1f6ec, CodepointWidth::Wide},
    UnicodeRange{0x1f6f4, 0x1f6f8, CodepointWidth::Wide},
    UnicodeRange{0x1f910, 0x1f93e, CodepointWidth::Wide},
    UnicodeRange{0x1f940, 0x1f94c, CodepointWidth::Wide},
    UnicodeRange{0x1f950, 0x1f96b, CodepointWidth::Wide},
    UnicodeRange{0x1f980, 0x1f997, CodepointWidth::Wide},
    UnicodeRange{0x1f9c0, 0x1f9c0, CodepointWidth::Wide},
    UnicodeRange{0x1f9d0, 0x1f9e6, CodepointWidth::Wide},
    UnicodeRange{0x20000, 0x2fffd, CodepointWidth::Wide},
    UnicodeRange{0x30000, 0x3fffd, CodepointWidth::Wide},
    UnicodeRange{0xe0100, 0xe01ef, CodepointWidth::Ambiguous},
    UnicodeRange{0xf0000, 0xffffd, CodepointWidth::Ambiguous},
    UnicodeRange{0x100000, 0x10fffd, CodepointWidth::Ambiguous}};

size_t CalculateWidthInternal(char32_t rune) {
  const auto it = std::lower_bound(s_wideAndAmbiguousTable.begin(),
                                   s_wideAndAmbiguousTable.end(), rune);

  // For characters that are not _in_ the table, lower_bound will return the
  // nearest item that is. We must check its bounds to make sure that our hit
  // was a true hit.
  if (it != s_wideAndAmbiguousTable.end() && rune >= it->lowerBound &&
      rune <= it->upperBound) {
    switch (it->width) {
    case CodepointWidth::Ambiguous:
      return 0;
    case CodepointWidth::Wide:
      return 2;
    case CodepointWidth::Narrow:
      return 1;
    default:
      break;
    }
    return 0;
  }
  return 1;
}

} // namespace unicode
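
To see why 🛠 (U+1F6E0) comes out as width 1, here is a minimal sketch that repeats the same lookup against only the two table entries on either side of it. The names `Range`, `kNeighbors`, and `WidthOf` are made up for this illustration and are not part of the Terminal source.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical, trimmed-down stand-in for s_wideAndAmbiguousTable: only the
// two Wide ranges surrounding U+1F6E0 are kept.
struct Range {
  char32_t lower;
  char32_t upper;
};

constexpr std::array<Range, 2> kNeighbors{{
    {0x1f6d0, 0x1f6d2},
    {0x1f6eb, 0x1f6ec},
}};

// Same idea as CalculateWidthInternal: binary-search for the first range whose
// upper bound is not below the code point, then verify that the hit is real.
inline std::size_t WidthOf(char32_t rune) {
  const auto it = std::lower_bound(
      kNeighbors.begin(), kNeighbors.end(), rune,
      [](const Range &r, char32_t value) { return r.upper < value; });
  if (it != kNeighbors.end() && rune >= it->lower && rune <= it->upper) {
    return 2; // inside a Wide range
  }
  return 1; // fell into a gap between ranges: presumed Narrow
}
```

With this table, `WidthOf(0x1f6e0)` lands on the `0x1f6eb` entry, fails the lower-bound check, and falls through to the Narrow default.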

@DHowett-MSFT
Contributor

It's... slightly more complicated than that. In addition to the table laid out in UCD EastAsianWidth 12.0, which expressly avoids specifying Emoji, there's the Emoji 12.0 table. That specifies which characters are emoji, but we can't just import that table as-is. It specifies a lot of things that aren't emoji as being emoji.

Like this:

0023          ; Emoji                #  1.1  [1] (#️)       number sign

If we ingest this table as is, we'll look even more wrong than we already are.
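
For readers unfamiliar with the file format, here is a rough sketch of how a line like the one above could be parsed. It is only an illustration: `EmojiDataEntry`, `Trim`, `ParseHex`, and `ParseEmojiDataLine` are hypothetical names, and the parser assumes the simple `range ; property # comment` shape shown in the sample line.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <string>

// Hypothetical record for one emoji-data.txt entry.
struct EmojiDataEntry {
  std::uint32_t first;
  std::uint32_t last;
  std::string property;
};

// Trims spaces and tabs from both ends.
inline std::string Trim(const std::string &s) {
  const auto begin = s.find_first_not_of(" \t");
  if (begin == std::string::npos) return "";
  const auto end = s.find_last_not_of(" \t");
  return s.substr(begin, end - begin + 1);
}

// Parses a hexadecimal code point; rejects trailing junk and out-of-range values.
inline std::optional<std::uint32_t> ParseHex(const std::string &text) {
  try {
    std::size_t used = 0;
    const unsigned long value = std::stoul(text, &used, 16);
    if (used != text.size() || value > 0x10FFFF) return std::nullopt;
    return static_cast<std::uint32_t>(value);
  } catch (const std::exception &) {
    return std::nullopt;
  }
}

// Parses "XXXX ; Property # comment" or "XXXX..YYYY ; Property # comment".
// Returns nullopt for blank lines, comment-only lines, and malformed input.
inline std::optional<EmojiDataEntry> ParseEmojiDataLine(const std::string &line) {
  const std::string data = Trim(line.substr(0, line.find('#')));
  const auto semicolon = data.find(';');
  if (semicolon == std::string::npos) return std::nullopt;

  const std::string codepoints = Trim(data.substr(0, semicolon));
  const std::string property = Trim(data.substr(semicolon + 1));
  if (codepoints.empty() || property.empty()) return std::nullopt;

  const auto dots = codepoints.find("..");
  const std::string firstText = codepoints.substr(0, dots);
  const std::string lastText =
      dots == std::string::npos ? firstText : codepoints.substr(dots + 2);

  const auto first = ParseHex(firstText);
  const auto last = ParseHex(lastText);
  if (!first || !last || *first > *last) return std::nullopt;
  return EmojiDataEntry{*first, *last, property};
}
```

Feeding it the `0023` line yields an entry for U+0023 with the property `Emoji`, which is exactly the problem: a naive "has the Emoji property, so make it wide" rule would widen the plain number sign.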

@fcharlie
Contributor

fcharlie commented Aug 23, 2019

@DHowett-MSFT After testing, I found that I only need to add an emoji_width table containing the missing emoji symbols to get the correct widths for these emoji.

(Some emoji display at different widths in different fonts.)

The Unicode intervals to add are as follows:

struct interval {
  char32_t first;
  char32_t last;
};
constexpr const interval emoji_width[] = {
      {0x2194, 0x2199},
      {0x21A9, 0x21AA},
      {0x231A, 0x231B},
      {0x2328, 0x2328},
      {0x23CF, 0x23CF},
      {0x23E9, 0x23F3},
      {0x23F8, 0x23FA},
      {0x24C2, 0x24C2},
      {0x25AA, 0x25AB},
      {0x25B6, 0x25B6},
      {0x25C0, 0x25C0},
      {0x25FB, 0x25FE},
      // u2600~u27BF // fast check
      {0x2600, 0x27BF},
      //---
      {0x2934, 0x2935},
      {0x2B05, 0x2B07},
      {0x2B1B, 0x2B1C},
      {0x2B50, 0x2B50},
      {0x2B55, 0x2B55},
      {0x3030, 0x3030},
      {0x3297, 0x3297},
      {0x3299, 0x3299},
      // 0x1F004 0x1F0CF ... unicode double width 2
      {0x1F300, 0x1F64F},
      {0x1F680, 0x1F6FF},
      {0x1F900, 0x1F9FF},
  };
\U00000023 width: 1 #
\U0000002A width: 1 *
\U00000030 width: 1 0
\U000032FF width: 2 ㋿
\U000000A9 width: 1 ©
\U000000AE width: 1 ®
\U00002934 width: 2 ⤴
\U0001F600 width: 2 😀
\U0000203C width: 1 ‼
\U00002194 width: 2 ↔
\U00002122 width: 1 ™
\U0001F603 width: 2 😃
\U0001F496 width: 2 💖
\U0001F499 width: 2 💙
\U0001F92A width: 2 🤪
\U00004E2D width: 2 中
\U000025B6 width: 2 ▶
\U00002600 width: 2 ☀
\U000000A1 width: 1 ¡
\U0001F6E0 width: 2 🛠
\U0000231A width: 2 ⌚
\U0000231B width: 2 ⌛
\U00002328 width: 2 ⌨
\U000023CF width: 2 ⏏
\U0000FE6F width: 2 ﹯
\U0001F004 width: 2 🀄
\U0001F0CF width: 2 🃏
\U0000270F width: 2 ✏
\U0001F577 width: 2 🕷
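
A minimal sketch of how such a supplementary table might be consulted before falling back to the East Asian Width result. It uses only a handful of the intervals listed above; `emoji_width_sample`, `InTable`, and `WidthWithEmoji` are hypothetical names, and the fallback width is passed in rather than computed, so this is not the actual patch.

```cpp
#include <cstddef>

struct interval {
  char32_t first;
  char32_t last;
};

// A few entries copied from the emoji_width table above; the full list would
// go here. The table must stay sorted by `first` and must not overlap.
constexpr interval emoji_width_sample[] = {
    {0x2194, 0x2199},
    {0x2600, 0x27BF},
    {0x1F300, 0x1F64F},
    {0x1F680, 0x1F6FF},
    {0x1F900, 0x1F9FF},
};

// Binary search over sorted, non-overlapping intervals.
constexpr bool InTable(const interval *table, std::size_t count, char32_t rune) {
  std::size_t lo = 0;
  std::size_t hi = count;
  while (lo < hi) {
    const std::size_t mid = lo + (hi - lo) / 2;
    if (rune > table[mid].last) {
      lo = mid + 1;
    } else if (rune < table[mid].first) {
      hi = mid;
    } else {
      return true;
    }
  }
  return false;
}

// Hypothetical wrapper: the emoji table wins; otherwise keep whatever width
// the existing East Asian Width lookup produced.
constexpr std::size_t WidthWithEmoji(char32_t rune, std::size_t eastAsianWidth) {
  constexpr std::size_t count =
      sizeof(emoji_width_sample) / sizeof(emoji_width_sample[0]);
  return InTable(emoji_width_sample, count, rune) ? 2 : eastAsianWidth;
}
```

With this sample, U+1F6E0 (🛠) and U+270F (✏) come out as 2 even when the East Asian Width lookup says 1, while `#` keeps its width of 1.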

DHowett-MSFT pushed a commit that referenced this issue Oct 15, 2019
From Egmont Koblinger:
> In terminal emulation, apps have to be able to print something and
keep track of the cursor, whereas they by design have no idea of the
font being used. In many terminals the font can also be changed runtime
and it's absolutely not feasible to then rearrange the cells. In some
other cases there is no font at all (e.g. the libvterm headless terminal
emulation library, or a detached screen/tmux), or there are multiple
fonts at once (a screen/tmux attached from multiple graphical
emulators).

> The only way to do that is via some external agreement on the number
of cells, which is typically the Unicode EastAsianWidth, often accessed
via wcwidth(). It's not perfect (changes through Unicode versions, has
ambiguous characters, etc.) but is still the best we have.

> glibc's wcwidth() reports 1 for ambiguous width characters, so the de
facto standard is that in terminals they are narrow.

> If the glyph is wider then the terminal has to figure out what to do.
It could crop it (newer versions of Konsole, as far as I know), overflow
to the right (VTE), shrink it (Kitty I believe does this), etc.

See Also:
https://bugzilla.gnome.org/show_bug.cgi?id=767529
https://gitlab.freedesktop.org/terminal-wg/specifications/issues/9
https://www.unicode.org/reports/tr11/tr11-34.html

Salient point from proposed update to Unicode Standard Annex 11:
> Note: The East_Asian_Width property is not intended for use by modern
terminal emulators without appropriate tailoring on a case-by-case
basis.

Fixes #2066
Fixes #2375 

Related to #900
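
The convention described in the quote (ambiguous-width characters occupy one cell, as glibc's wcwidth() reports) can be sketched as a simple mapping. This is an illustration only: `CellsFor` is a hypothetical helper, and giving `Invalid` one cell is an assumption made for the sketch.

```cpp
#include <cstdint>

enum class CodepointWidth : std::uint8_t { Narrow, Wide, Ambiguous, Invalid };

// Maps a width class to a number of terminal cells. Ambiguous is treated as
// narrow, mirroring the glibc wcwidth() behavior described in the quote.
// Treating Invalid as one cell is an assumption made for this sketch.
constexpr int CellsFor(CodepointWidth width) {
  switch (width) {
  case CodepointWidth::Wide:
    return 2;
  case CodepointWidth::Ambiguous:
  case CodepointWidth::Narrow:
  case CodepointWidth::Invalid:
    return 1;
  }
  return 1; // unreachable for valid enum values
}
```

The point of the design is that the answer depends only on the code point, never on the font, so an application and the terminal can agree on cursor positions.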
@JustinGrote

@miniksa So definitely all the emoji that rely on a variation selector (VS16, U+FE0F) to force a "character" into emoji rendering are affected, probably because two code points are being jammed into one cell's width, which is why the emoji comes out small. The spider (U+1F577 U+FE0F), mentioned previously, is one example.

So detecting a VS16 and adjusting the character width based on the number of following characters would do it. Is that part of the #2928 PR?
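
A rough sketch of the suggestion above, assuming the text is already decoded to UTF-32. `ClusterWidth` and `kVariationSelector16` are hypothetical names; this only illustrates the idea being proposed in this comment and is not what Windows Terminal does.

```cpp
#include <cstddef>
#include <string>

// U+FE0F VARIATION SELECTOR-16 requests emoji presentation for the
// preceding code point.
constexpr char32_t kVariationSelector16 = 0xFE0F;

// Width in cells of the cluster starting at text[index]. `baseWidth` is the
// width the base code point would get on its own. If the base is immediately
// followed by VS16, the pair is given two cells.
inline std::size_t ClusterWidth(const std::u32string &text, std::size_t index,
                                std::size_t baseWidth) {
  const bool hasVs16 =
      index + 1 < text.size() && text[index + 1] == kVariationSelector16;
  return hasVs16 ? 2 : baseWidth;
}
```

For the spider, `ClusterWidth(U"\U0001F577\uFE0F", 0, 1)` would give 2, while the bare U+1F577 keeps the base width of 1.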

@benc-uk

benc-uk commented Nov 22, 2019

This has been driving me crazy; I thought it was something I was doing wrong.

For example, the cloud glyph I've added to my prompt is tiny:
image
but in the VS Code terminal it's fine:
image

@JustinGrote

JustinGrote commented Nov 22, 2019

@benc-uk yep, it's because the actual icon is two characters: a "standard" cloud character, followed by a variation selector (VS16, U+FE0F) which tells the terminal to "force" it to emoji style rather than text style.

VS Code uses xterm.js internally, and it knows how to interpret that (though it has other spacing issues). Windows Terminal currently only sees it as "one character" in terms of width, so that's why it's half-sized.

@DHowett-MSFT DHowett-MSFT assigned leonMSFT and unassigned miniksa Apr 29, 2020
@ghost ghost added the In-PR This issue has a related PR label May 7, 2020
@ghost ghost closed this as completed in #5795 May 8, 2020
@ghost ghost removed the In-PR This issue has a related PR label May 8, 2020
ghost pushed a commit that referenced this issue May 8, 2020
The table that we refer to in `CodepointWidthDetector.cpp` to determine
whether a codepoint should be rendered as Wide or Narrow was
based on EastAsianWidth[1]. If a codepoint isn't included in this
table, it is considered Narrow. Many emoji aren't specified in the
EAW list, so this PR supplements our table with emoji codepoints from
emoji-data[2] in order to render most, if not all, emoji as full-width.

There are certain codepoints I've added to the comments (in case we want
to add them officially to the table in the future) that Microsoft
decided to give an emoji presentation even if it's specified as
Narrow/Ambiguous in the EAW list and are _not_ specified in the Unicode
emoji list. These include all of the Mahjong Tiles block, different
direction pencils (✎✐), different pointing index fingers (☜, ☞) among
others. I have no idea if I've captured all of them, as I don't know of
an easy way to detect which are Microsoft specific emojis.

## Validation Steps Performed
I have looked at so many emojis that I dream emoji.

These screenshots aren't encompassing _all_ emoji but I've tried to grab
a couple from all across the codepoint ranges:

Before:
![before](https://user-images.githubusercontent.com/57155886/81445092-2051a980-912d-11ea-9739-c9f588da407d.png)

After:
![after](https://user-images.githubusercontent.com/57155886/81445107-2778b780-912d-11ea-9615-676c2150e798.png)

[1] http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
[2] https://www.unicode.org/Public/13.0.0/ucd/emoji/emoji-data.txt

Closes #900
@ghost ghost added the Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. label May 8, 2020
DHowett-MSFT pushed a commit that referenced this issue May 8, 2020
(cherry picked from commit 7ae3433)
@ghost

ghost commented May 13, 2020

🎉 This issue was addressed in #5795, which has now been successfully released as Windows Terminal Release Candidate v0.11.1333.0 (1.0rc2). 🎉

Handy links:

DHowett-MSFT pushed a commit that referenced this issue May 16, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retains
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

Refs #900, #5914.
DHowett pushed a commit that referenced this issue May 17, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retains
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

## Detailed Description of the Pull Request / Additional comments

Late Friday evening, I tested my emoji test file on iTerm2. In so doing, I realized
that @j4james and @leonMSFT were right the entire time in #5914: Emoji
that require `U+FE0F` must not be double-width by default.

I finally banged up a PowerShell script that parses the UCD and emits a codepoint
width table. Once checked in, this will be definitive.

Refs #900, #5914.
Fixes #5941.
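
The PowerShell script itself isn't shown in this thread. As a rough sketch of the idea only, here is a small Python version that reads lines in the `emoji-data.txt` format and keeps just the ranges marked `Emoji_Presentation`. The `SAMPLE` text is hand-written to imitate the file format (it is not a verbatim excerpt), and `parse_ranges` is a hypothetical helper name, not anything from the Terminal repository.

```python
import re

# Hand-written lines imitating the emoji-data.txt layout (assumption: not verbatim).
SAMPLE = """\
# codepoint(s) ; property  # trailing comment
26A0..26A1    ; Emoji                # warning sign, high voltage
26A1          ; Emoji_Presentation   # high voltage
270F          ; Emoji                # pencil (emoji only with U+FE0F)
1F600         ; Emoji_Presentation   # grinning face
1F680..1F6C5  ; Emoji_Presentation   # transport symbols
"""

# "XXXX" or "XXXX..YYYY", then ";", then the property name.
LINE_RE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")


def parse_ranges(text, wanted_property):
    """Return sorted (first, last) codepoint ranges that carry wanted_property."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        first = int(match.group(1), 16)
        last = int(match.group(2), 16) if match.group(2) else first
        if match.group(3) == wanted_property:
            ranges.append((first, last))
    return sorted(ranges)


# Only codepoints that default to emoji presentation are treated as wide.
wide_emoji = parse_ranges(SAMPLE, "Emoji_Presentation")
for first, last in wide_emoji:
    print(f"{{ 0x{first:04X}, 0x{last:04X} }},")
```

With this filter, the pencil (`U+270F`) stays out of the wide list because it carries only the `Emoji` property, which matches the point above about emoji that require `U+FE0F`.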
DHowett pushed a commit that referenced this issue May 17, 2020
(cherry picked from commit ba1a298)
@GTMxCode

GTMxCode commented May 28, 2020

Hey, I was linked here from 4747. I made a post back in late April asking if there was a way I could revert the glyph scaling, and I was pretty hopeful when I came across the post again.

I followed the link above to the store and downloaded the recent version. (I don't know if it helps or not, but I am on the Win10 slow ring.) Anyway, things to note:

pros

  • Powerline glyph rendering issues are gone; the tips of my prompt are no longer mismatched.

cons

  • I don't see any difference in terms of scaling.

I bounce between a few monospaced fonts customized with the devicon/font-awesome/powerline font type packages. At first I thought all I had to do was simply find a sweet spot for the glyphs, and believe me, I tried to cheat the system with my own font builds before I gave up and moved on. Anyway, I digress.

Some of the basic Wingdings and things like the check marks, and other glyphs which maintain similar advances to the main font chars, don't appear to be affected all that much.

Here's an example of how extreme the scaling can be:

https://i.imgur.com/nzdWrK9.png

I moved away towards Hyper because of the customization it provides and kind of forgot about win term until the other day. It's actually such a clean experience; my ONLY gripes are the tabs being massive and the font issue, and I guess I take my fonts pretty seriously, so it's something I do care about, although I know not everyone does. I can live without full-blown UI customizations and what have you, but the display gotta be good.

(and for the record, it is super crisp. I REALLY like what has been done)

(just give me my full width glyphs)

@DHowett
Member

DHowett commented May 28, 2020

So, icon fonts occupy codepoints that are reserved in all versions of Unicode and do not have the "wide glyph" or "emoji presentation" flags set. The best we can do going forward is to render them over two cells (spill) and let the righthand cell destroy the right half of the spillover.

Folks who take fonts seriously should attempt to get those glyph width changes made standard 😉. Until then, it's an absolute crapshoot. More info at #5095 (comment).
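
To illustrate why a terminal has nothing to go on here, a minimal sketch using Python's standard `unicodedata` module. It assumes the icon glyphs in question sit in the Private Use Area, as the Powerline separator `U+E0B0` does; `describe` is a hypothetical helper for this example only.

```python
import unicodedata


def describe(ch):
    """Summarise what the Unicode database says about a single character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        # General category "Co" means Private Use: Unicode assigns no meaning.
        "private_use": unicodedata.category(ch) == "Co",
        # "W" or "F" would mean wide; anything else gives no wide hint.
        "east_asian_width": unicodedata.east_asian_width(ch),
    }


powerline_arrow = "\ue0b0"  # glyph meaning comes from the patched font, not Unicode
print(describe(powerline_arrow))
print(describe("中"))  # a CJK ideograph, for comparison
```

The CJK character is reported as wide, while the private-use codepoint carries no wide flag at all, so any width the font designer intended is invisible to the terminal.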

jelster pushed a commit to jelster/terminal that referenced this issue May 28, 2020
The table that we refer to in `CodepointWidthDetector.cpp` to determine
whether a codepoint should be rendered as Wide or Narrow was
based on EastAsianWidth[1]. If a codepoint isn't included in this
table, it is considered Narrow. Many emoji aren't specified in the
EAW list, so this PR supplements our table with emoji codepoints from
emoji-data[2] in order to render most, if not all, emoji as full-width.

There are certain codepoints I've added to the comments (in case we want
to add them officially to the table in the future) that Microsoft
decided to give an emoji presentation even though they are specified as
Narrow/Ambiguous in the EAW list and are _not_ specified in the Unicode
emoji list. These include all of the Mahjong Tiles block, different
direction pencils (✎✐), different pointing index fingers (☜, ☞) among
others. I have no idea if I've captured all of them, as I don't know of
an easy way to detect which are Microsoft specific emojis.
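
For readers who want the shape of the lookup described above, here is a minimal Python sketch: a sorted list of wide ranges, a binary search, and Narrow as the default for anything not listed. The ranges are tiny illustrative excerpts chosen for the example (assumptions, not copied from `CodepointWidthDetector.cpp`), and `is_wide` is a hypothetical name.

```python
from bisect import bisect_right

# Illustrative excerpts only; the real table is far larger.
EAW_WIDE = [(0x1100, 0x115F), (0x4E00, 0x9FFF)]  # ranges taken from EastAsianWidth
EMOJI_WIDE = [(0x1F600, 0x1F64F)]                # supplement taken from emoji-data

# Merge both sources into one sorted table of (first, last) ranges.
WIDE_RANGES = sorted(EAW_WIDE + EMOJI_WIDE)
STARTS = [first for first, _ in WIDE_RANGES]


def is_wide(codepoint):
    """True if codepoint falls in a listed range; unlisted codepoints are Narrow."""
    # Find the last range whose start is <= codepoint, then check its end.
    index = bisect_right(STARTS, codepoint) - 1
    return index >= 0 and codepoint <= WIDE_RANGES[index][1]


for ch in ("A", "中", "\U0001F600"):
    print(f"U+{ord(ch):04X}", "Wide" if is_wide(ord(ch)) else "Narrow")
```

Supplementing the table then just means adding more ranges to the second list; the lookup itself doesn't change.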

[1] http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
[2] https://www.unicode.org/Public/13.0.0/ucd/emoji/emoji-data.txt

Closes microsoft#900
jelster pushed a commit to jelster/terminal that referenced this issue May 28, 2020
@lainisourgod

I still have the issue with half-sized emoji, for example ⚠. Windows Terminal build v1.7.1033.0, WSL 2, neovim.

image

@DHowett
Member

DHowett commented Apr 19, 2021

@meandmymind The default representation for ⚠ is "narrow, single-width, non-emoji". The bug here is that it is yellow (emoji presentation), not that it is small.
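
A quick way to check this from the Unicode data, sketched with Python's standard `unicodedata` module (the variable names are just for this example). It also shows the two variation selectors an application can append to request text or emoji presentation explicitly.

```python
import unicodedata

WARNING = "\u26a0"
VS15 = "\ufe0e"  # VARIATION SELECTOR-15: request text presentation
VS16 = "\ufe0f"  # VARIATION SELECTOR-16: request emoji presentation

# East Asian Width "N" (neutral) means the bare codepoint is not flagged as wide.
print(unicodedata.name(WARNING), unicodedata.east_asian_width(WARNING))

# The same base character with an explicit presentation request appended.
text_style = WARNING + VS15
emoji_style = WARNING + VS16
for label, sequence in (("text", text_style), ("emoji", emoji_style)):
    codepoints = " ".join(f"U+{ord(c):04X}" for c in sequence)
    print(label, codepoints)
```

Without a selector the bare `U+26A0` defaults to the narrow text form, which is the behaviour described in the comment above.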

This issue was closed.