[WIP] Expose convertEncoding convenience function to controller #13772
Allow converting from UTF-8 to whatever encoding the device supports
src/controllers/controller.h
Outdated
class Charsets : public QObject {
};
?
I wanted to expose some common charsets as constants to JS, maybe moving convertEncoding to that new object too, so that from JS it would look like engine.charsets.convertEncoding(engine.charsets.WELL_KNOWN_CHARSETS.LATIN_9), for instance. But I still haven't figured out how to do that. I feel like I may be trying to bite off more than I can chew here, given my C++ skills…
I think making it part of the engine class is unnecessary. You can simply add it to the global object; it requires a little bit of boilerplate, but not much. See ControllerScriptEngineBase::initialize().
Oh, you mean have charsets exposed in the global namespace? Aye. I could do that.
To add an enum, simply add it to the Charsets class and make it a Q_ENUM. Seems fairly doable IMO, I'm sure you can figure it out. If not, feel free to ask.
I think design-wise, it would make most sense to just return a QByteArray (which is automatically a JS TypedArray), since if you're converting into a different format, you'll likely want to send out the bytes soon after without manipulating them further (at least in terms of characters). Also, putting it back into a QString would either convert it back to UTF-16 or create a (likely invalid, UB-causing) string.
Isn't that what I'm doing here?
I don't understand… Where am I doing that?
whoops, yeah, that assumption was based on the |
src/controllers/controller.h
Outdated
icu::UnicodeString unicodeString;
UErrorCode errorCode;
UConverter* latin9Converter = ucnv_open(targetCharset.toLocal8Bit().data(), &errorCode);

if (!U_FAILURE(errorCode)) {
    ucnv_close(latin9Converter);
    return QJSValue::UndefinedValue;
}

char* result = nullptr;
ucnv_fromUChars(latin9Converter,
        result,
        0,
        &value.data()->unicode(),
        value.length(),
        &errorCode);
ucnv_close(latin9Converter);
Qt has a wrapper for all this. I think it should be:
- icu::UnicodeString unicodeString;
- UErrorCode errorCode;
- UConverter* latin9Converter = ucnv_open(targetCharset.toLocal8Bit().data(), &errorCode);
- if (!U_FAILURE(errorCode)) {
-     ucnv_close(latin9Converter);
-     return QJSValue::UndefinedValue;
- }
- char* result = nullptr;
- ucnv_fromUChars(latin9Converter,
-         result,
-         0,
-         &value.data()->unicode(),
-         value.length(),
-         &errorCode);
- ucnv_close(latin9Converter);
+ QTextCodec* codec = QTextCodec::codecForName(targetCharset);
+ if (!codec) {
+     return QJSValue::UndefinedValue;
+ }
+ QByteArray result = codec->fromUnicode(value);
(Not tested)
Unfortunately this nice API was replaced in Qt 6 by QStringConverter, which supports far fewer codecs. The old API is still available in the "Qt 5 Core Compat" module for Qt 6, but I guess it should not be used in new code.
Mmm, interesting.
In Qt 6 there is a common base class, QStringConverterBase, for QStringConverter and QTextCodec:
https://github.com/qt/qtbase/blob/30d90b4ccad83ab1f23dab7cd72b7e228c299895/src/corelib/text/qstringconverter.cpp#L1772
And it is also using ucnv under the hood:
https://github.com/qt/qtbase/blob/30d90b4ccad83ab1f23dab7cd72b7e228c299895/src/corelib/text/qstringconverter.cpp#L2085
- icu::UnicodeString unicodeString;
- UErrorCode errorCode;
- UConverter* latin9Converter = ucnv_open(targetCharset.toLocal8Bit().data(), &errorCode);
- if (!U_FAILURE(errorCode)) {
-     ucnv_close(latin9Converter);
-     return QJSValue::UndefinedValue;
- }
- char* result = nullptr;
- ucnv_fromUChars(latin9Converter,
-         result,
-         0,
-         &value.data()->unicode(),
-         value.length(),
-         &errorCode);
- ucnv_close(latin9Converter);
+ QStringEncoder encoder = QStringEncoder(targetCharset);
+ if (!encoder.isValid()) {
+     return QJSValue::UndefinedValue;
+ }
+ QByteArray result = encoder.encode(value);
Our Qt vcpkg is built with "icu":
https://github.com/daschuer/vcpkg/blob/5d58718e04ee9196314dc77d7cf414f36be24b65/ports/qtbase/vcpkg.json#L62
@daschuer thank you for pointing this out. If I can't use QTextCodec, there's definitely some code I can copy here.
So if I'm not mistaken, this should look like what is in my last commit. This, however, doesn't work as expected. From the JS side, convertEncoding returns an object that, when printed using console.log, displays the original string, not a TypedArray. I can't figure out what's happening. I'm leaving it for tonight.
That's weird. It should work: https://doc.qt.io/qt-6/qtqml-cppintegration-data.html#qbytearray-to-javascript-arraybuffer
Maybe do some type introspection (instanceof) to confirm whether it is an ArrayBuffer.
Ah, you're right. I expected a TypedArray and thus a .forEach(). So I'd need to do new Uint8Array(midi.convertEncoding("ISO-8859-15", "Thing to display")) from JS? Is there a way to directly return a TypedArray?
You can technically execute some JS code in C++ that essentially does that and return the resulting QJSValue, but that's quite ugly IMO. I think it should be up to the caller how they want to interpret that buffer. If that's too much typing for you, you can always create a utility function in your mapping, but let's not bake that into the API. It will cause API inconsistencies and unnecessary overhead.
899d84d to b730583
src/controllers/controller.h
Outdated
// The length parameter is here for backwards compatibility for when scripts
// were required to specify it.
Q_INVOKABLE virtual void send(const QList<int>& data, unsigned int length = 0) {
    Q_UNUSED(length);
    m_pController->send(data, data.length());
}

// Available charsets should be available here:
// http://www.iana.org/assignments/character-sets/character-sets.xhtml
Q_INVOKABLE QByteArray convertEncoding(const QString& targetCharset, const QString& value) {
So I can expose well-known charsets as an enum to QJSEngine, but a C++ enum only accepts ints as values. Is there a way I can use strings as values, or modify the signature of this function to accept either a QString or an enum?
You can take a QJSValue (which acts as a QVariant) and then switch internally depending on the type of the contained value. I'm not sure that complexity is worth it, though. I would prefer to have an enum exposed. A string would make more sense if the set of encodings were open, but in this case it's closed because we're limited by QStringConverter::Encoding.
Oh no, since 6.4, the set is open with QStringConverter. That's why my last commit uses QTextCodec prior to 6.4. Virtually any encoding supported by ICU should be available. See my last comment: I used Latin-9, which is not available in QStringConverter::Encoding.
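To make concrete what a conversion to Latin-9 (ISO-8859-15) produces at the byte level, here is a minimal Qt-free C++17 sketch. It is not the code from this PR: `encodeLatin9` is a hypothetical helper written for illustration, it takes UTF-32 input rather than a QString, and it returns `std::nullopt` where the real function would report failure to the script.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Mixxx or Qt): encode Unicode code points
// as ISO-8859-15 (Latin-9) bytes. Returns std::nullopt as soon as a code
// point has no Latin-9 representation.
std::optional<std::vector<std::uint8_t>> encodeLatin9(const std::u32string& text) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(text.size());
    for (char32_t cp : text) {
        switch (cp) {
        // The eight positions where Latin-9 differs from Latin-1.
        case 0x20AC: bytes.push_back(0xA4); break; // EURO SIGN
        case 0x0160: bytes.push_back(0xA6); break; // S WITH CARON
        case 0x0161: bytes.push_back(0xA8); break; // s WITH CARON
        case 0x017D: bytes.push_back(0xB4); break; // Z WITH CARON
        case 0x017E: bytes.push_back(0xB8); break; // z WITH CARON
        case 0x0152: bytes.push_back(0xBC); break; // LIGATURE OE
        case 0x0153: bytes.push_back(0xBD); break; // ligature oe
        case 0x0178: bytes.push_back(0xBE); break; // Y WITH DIAERESIS
        // The Latin-1 characters displaced by the eight above.
        case 0x00A4: case 0x00A6: case 0x00A8: case 0x00B4:
        case 0x00B8: case 0x00BC: case 0x00BD: case 0x00BE:
            return std::nullopt;
        default:
            if (cp > 0xFF) {
                return std::nullopt; // outside the single-byte range
            }
            bytes.push_back(static_cast<std::uint8_t>(cp));
        }
    }
    // Latin-9 is a single-byte charset: one output byte per code point.
    assert(bytes.size() == text.size());
    return bytes;
}
```

For example, the euro sign U+20AC becomes the single byte 0xA4, whereas a CJK character makes the whole conversion fail.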
Actually, there is probably a little more we can do here. Apparently there are more optional codecs available via QStringConverter::availableCodecs(), which we might want to support. In that case we should probably mirror the QStringEncoder API a little and have one function taking the enum and one function taking the string. I wouldn't try to emulate that via the QJSValue overload stuff, though, because the manual dispatch is annoying. Instead I would create two different Q_INVOKABLE methods that are implemented in terms of the correct QStringEncoder overload. I'd also add another QT_VERSION_CHECK, so we can skip the UTF-16 to UTF-8 conversion for > Qt 6.8.
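The two-overload shape suggested above can be sketched without Qt as follows. Everything here is an assumption made for illustration: the `WellKnownCharset` enumerators, the `charsetName` mapping and the `resolveCharset` stand-in (which only reports the charset name that would be handed to the encoder) are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical enum; members chosen for illustration only.
enum class WellKnownCharset { Utf8, Latin1, Latin9 };

// Map an enumerator to its IANA charset name.
std::string_view charsetName(WellKnownCharset charset) {
    switch (charset) {
    case WellKnownCharset::Utf8: return "UTF-8";
    case WellKnownCharset::Latin1: return "ISO-8859-1";
    case WellKnownCharset::Latin9: return "ISO-8859-15";
    }
    assert(false && "unhandled WellKnownCharset enumerator");
    return {};
}

// String overload: the open-ended entry point. This stand-in only checks
// that a name was given; a real implementation would look up the codec.
std::optional<std::string> resolveCharset(std::string_view name) {
    if (name.empty()) {
        return std::nullopt;
    }
    return std::string(name);
}

// Enum overload: implemented in terms of the string overload, so there is
// no manual type dispatch inside a single function.
std::optional<std::string> resolveCharset(WellKnownCharset charset) {
    return resolveCharset(charsetName(charset));
}
```

Callers pick the overload by argument type, so `resolveCharset(WellKnownCharset::Latin9)` and `resolveCharset("ISO-8859-15")` end up on the same code path.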
> Oh no, since 6.4, the set is open with QStringConverter. That's why my last commit uses QTextCodec prior to 6.4. Virtually any encoding supported by ICU should be available. See my last comment: I used Latin-9, which is not available in QStringConverter::Encoding.
Yeah, you're right. I didn't see that comment when I wrote the last one.
Some nit picks.
src/controllers/controller.h
Outdated
@@ -16,7 +18,7 @@ class Controller : public QObject {
     Q_OBJECT
   public:
     explicit Controller(const QString& deviceName);
-    ~Controller() override; // Subclass should call close() at minimum.
+    ~Controller() override; // Subclass should call close() at minimum.
Just a note, no need for a change: make sure your editor does not reformat the whole file. That only adds noise to PRs and may conflict unnecessarily with other PRs.
Hmm… This is a pre-commit fix, isn't it?
yeah, but pre-commit only touches the lines in the diff.
src/controllers/controller.h
Outdated
@@ -193,13 +196,33 @@ class ControllerJSProxy : public QObject {
         : m_pController(m_pController) {
     }

     QHash<QString, QString> m_cache;
Since we are in the controller class this cache needs a more significant name.
What, is this unused?
Ah no, sorry. This was left from a previous experimentation. This must be removed.
src/controllers/controller.h
Outdated
// http://www.iana.org/assignments/character-sets/character-sets.xhtml
Q_INVOKABLE QByteArray convertEncoding(const QString& targetCharset, const QString& value) {
#if QT_VERSION < QT_VERSION_CHECK(6, 4, 0)
    auto* codec = QTextCodec::codecForName(targetCharset.toUtf8());
Please prefix pointers with p: pCodec
src/controllers/controller.h
Outdated
// The length parameter is here for backwards compatibility for when scripts
// were required to specify it.
Q_INVOKABLE virtual void send(const QList<int>& data, unsigned int length = 0) {
    Q_UNUSED(length);
    m_pController->send(data, data.length());
}

// Available charsets should be available here:
// http://www.iana.org/assignments/character-sets/character-sets.xhtml
Since the supported charsets depend on the Qt build options and the ICU version, we should only allow charsets from a positive list.
The charsets on this positive list can then be probed in the CMake configuration step, to ensure that nobody builds a Mixxx version that doesn't support this functionality of the mapping API:
# Add a custom command to compile the code
add_custom_command(
OUTPUT list_codecs
COMMAND ${CMAKE_CXX_COMPILER} -o list_codecs -xc++ - <<EOF
#include <QStringConverter>
#include <QStringList>
#include <QTextStream>
int main(int argc, char* argv[]) {
if (argc != 2) {
return 1; // Invalid number of arguments
}
QString charset = argv[1];
QStringList codecs = QStringConverter::availableCodecs();
if (codecs.contains(charset)) {
return 0; // Charset is available
} else {
return 1; // Charset is not available
}
}
EOF
WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
COMMENT "Compiling list_codecs"
)
# List of specified charsets to check
set(SPECIFIED_CHARSETS "UTF-8;ISO-8859-1;ISO-8859-15")
# Check for each specified charset
foreach(CHARSET IN LISTS SPECIFIED_CHARSETS)
execute_process(
COMMAND ${CMAKE_BINARY_DIR}/list_codecs ${CHARSET}
WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
RESULT_VARIABLE RETURN_CODE
)
if(${RETURN_CODE} EQUAL 0)
message(STATUS "Charset ${CHARSET} is available.")
else()
message(FATAL_ERROR "Charset ${CHARSET} is not available.")
endif()
endforeach()
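The core of the positive-list check is independent of CMake and Qt, and can be sketched in plain C++. `missingCharsets` is a hypothetical helper, and the exact, case-sensitive name comparison is a simplifying assumption; real codec lookups typically also accept aliases.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: return every required charset that does not appear
// in the list of codecs reported as available by the build.
std::vector<std::string> missingCharsets(
        const std::vector<std::string>& required,
        const std::vector<std::string>& available) {
    std::vector<std::string> missing;
    for (const auto& name : required) {
        // Assumption: names are compared exactly, without alias handling.
        if (std::find(available.begin(), available.end(), name) == available.end()) {
            missing.push_back(name);
        }
    }
    assert(missing.size() <= required.size());
    return missing;
}
```

A build-time probe along these lines would fail the configuration whenever the returned list is non-empty, while a run-time variant could instead let a mapping disable the affected feature.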
Alternatively, we just let the mapping handle it. Not sure what's better.
Meh. I'm not in favor of that. The idea is to allow controller developers to use charsets if available without having to open a PR to make them available. I can, however, check against QStringConverter::availableCodecs() and QTextCodec::availableCodecs().
Every Mixxx installation must support each mapping. The mapping API must be stable across installations. There is no problem with adding a long list of charsets to the positive list.
Not sure packagers will be very happy about this, though. Letting the mapping handle it (by disabling the corresponding feature) may make more sense. It depends entirely on the list.
src/controllers/controller.h
Outdated
@@ -16,7 +18,7 @@ class Controller : public QObject {
     Q_OBJECT
   public:
     explicit Controller(const QString& deviceName);
I think this functionality should not be in the controller class itself, which contains the protocol-specific controller IO functions.
Either add it to the engine API (src/controllers/scripting/legacy/controllerscriptinterfacelegacy.h) or add a dedicated charSetMapper class like src/controllers/scripting/colormapperjsproxy.h.
I think it makes sense, since this is computing data that is supposed to be sent through MIDI. Isn't it adding complexity to add yet another interface to the global object, in particular an interface with only one function?
The engine class is already there, and used for nearly every function in the mapping API other than raw IO. There is no overhead, just a different existing class.
It makes less sense to me, but I won't fight on this and will move it to the engine object.
Should I expose
No, the API must guarantee to support the same charsets on all platforms and builds of Mixxx.
That'll limit us to a somewhat small common subset though. Is that a price we're willing to pay? Do we want to tell a contributor "no, we cannot accept your mapping because one niche feature doesn't work on one particular platform"?
I don't think so. If you use a Qt build with the same ICU version, you will always get the same charsets. We just need to ensure that the configuration of the Mixxx build fails if someone uses an improper Qt build.
Plus, it's not as if it poses the risk of a crash. At worst, the function would return
b730583 to 531a483
 * @param {string} targetCharset The charset to encode the string into.
 * @param {string} value The string to encode
 * @returns {ArrayBuffer | undefined} The converted String as an array of bytes or undefined if an error happened when performing conversion
Maybe jsdoc/require-param-type and jsdoc/require-returns-type should be disabled, since we're documenting types using JS.
531a483 to 659e759
#else
    QStringEncoder fromUtf8 = QStringEncoder(targetCharset.toUtf8().data());
    if (!fromUtf8.isValid()) {
        return nullptr;
Wait, that doesn't translate into undefined. Am I really forced to use QJSValue as a return type?
    ISO_10646_UCS_2

};
Q_ENUM(WellKnownCharsets)
Dammit! I can't find how to expose this enum under engine.WellKnownCharsets. Can someone help me?
You need to tell the JavaScript-Interpreter (QJSEngine) about it. Something like this:
void registerEnums(QJSEngine& engine) {
    qRegisterMetaType<mixxx::preferences::constants::WellKnownCharsets>("WellKnownCharsets");
    const QMetaObject& metaObject = mixxx::preferences::constants::staticMetaObject;
    QMetaEnum metaEnum = metaObject.enumerator(metaObject.indexOfEnumerator("WellKnownCharsets"));
    // Copy every enumerator into a plain JS object: key name -> integer value.
    QJSValue enumObject = engine.newObject();
    for (int enumIdx = 0; enumIdx < metaEnum.keyCount(); ++enumIdx) {
        enumObject.setProperty(metaEnum.key(enumIdx), metaEnum.value(enumIdx));
    }
    engine.globalObject().setProperty("WellKnownCharsets", enumObject);
}