[Feature] I need solutions for re-unicode Chinese character and adjust floating point precision #2922

SMFDrummer · 2025-02-06T17:41:34Z

What happened?

My original data looks like this:

{
    "sd": {
        "n": "\u81ea\u4fe1\u7684\u5927\u5634\u82b1",
        "lmz": 1.000000,
        "rsd": {
            "wr": 0.300000,
            "lr": 0.300000,
            "bc": 0
        }
    }
}

When I call parseToJsonElement and then encodeToString, the JSON string changes to:

{
    "sd": {
        "n": "自信的大嘴花",
        "lmz": 1.0,
        "rsd": {
            "wr": 0.3,
            "lr": 0.3,
            "bc": 0
        }
    }
}

I'd like to

The reason I am not mapping this JSON to a @Serializable data class is that it is huge: thousands of lines. I simply cannot model all of it, so I have to work with JsonElement / JsonObject instead. How can I keep the Unicode escapes unconverted, and how can I make the floating-point values keep six decimal places? Please give me some tips.

SMFDrummer · 2025-02-07T04:20:55Z

I tried a custom KSerializer:

import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.descriptors.PrimitiveKind
import kotlinx.serialization.descriptors.PrimitiveSerialDescriptor
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encodeToString
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.Json
import org.jetbrains.annotations.TestOnly

@Serializable
data class Data(
    val rs: Int,
    @Serializable(with = UnicodeStringSerializer::class)
    val n: String
)

object UnicodeStringSerializer : KSerializer<String> {
    override val descriptor: SerialDescriptor = PrimitiveSerialDescriptor("UnicodeString", PrimitiveKind.STRING)
    override fun deserialize(decoder: Decoder): String {
        val decoded = decoder.decodeString()
        return decoded.split("\\u")
            .filter { it.isNotEmpty() }
            .joinToString("") {
                it.toInt(16).toChar().toString()
            }
    }

    override fun serialize(encoder: Encoder, value: String) {
        val unicodeEscaped = value.toCharArray().joinToString("") {
            "\\u" + it.code.toString(16).padStart(4, '0')
        }
        encoder.encodeString(unicodeEscaped)
    }
}

@TestOnly
fun main() {
    val data = Data(rs = 1703239116, n = "自信的大嘴花")
    val json = Json { encodeDefaults = true }
    val jsonString = json.encodeToString(data)
    println(jsonString)
}

Then I get a result like this:

{"rs":1703239116,"n":"\\u81ea\\u4fe1\\u7684\\u5927\\u5634\\u82b1"}

But what I want is:

{"rs":1703239116,"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1"}

Can Kotlin not handle this? Please give me some advice.
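The doubled backslashes appear because encodeString receives the backslash as an ordinary character and escapes it according to JSON rules, so a serializer that goes through encodeString cannot emit a raw \uXXXX sequence. The escaping step itself can be written with the standard library alone. The sketch below uses a hypothetical helper name, escapeNonAscii, which is not part of kotlinx.serialization. It escapes only non-ASCII UTF-16 code units and leaves quotes and backslashes untouched, so those would still need separate handling before the result is placed inside a JSON string.

```kotlin
// Hypothetical helper, not part of kotlinx.serialization:
// replaces every non-ASCII UTF-16 code unit with a \uXXXX escape.
fun String.escapeNonAscii(): String = buildString {
    for (ch in this@escapeNonAscii) {
        if (ch.code < 0x80) {
            append(ch) // plain ASCII is kept as-is
        } else {
            append("\\u").append(ch.code.toString(16).padStart(4, '0'))
        }
    }
}

fun main() {
    println("自信的大嘴花".escapeNonAscii()) // \u81ea\u4fe1\u7684\u5927\u5634\u82b1
    println("n = abc".escapeNonAscii())      // n = abc
}
```

Unlike a version that escapes every character, this keeps ASCII text readable, which matters if the original file only escaped non-ASCII characters.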

sandwwraith · 2025-02-07T13:52:04Z

We do not have any setting to control output formatting, unfortunately. You can try to use JsonUnquotedLiteral together with a JsonTransformingSerializer: https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/json.md#encoding-literal-json-content-experimental
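A minimal sketch of the JsonUnquotedLiteral part of this suggestion, assuming kotlinx-serialization-json 1.5.0 or newer is on the classpath (the function is experimental). The literal's content is written to the output verbatim, so a string value has to carry its own quotes and already be valid JSON. The keys and values are borrowed from the example earlier in the thread.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonUnquotedLiteral
import kotlinx.serialization.json.buildJsonObject

@OptIn(ExperimentalSerializationApi::class)
fun buildSample(): JsonElement = buildJsonObject {
    // The content includes the surrounding quotes; the backslashes are
    // emitted as-is instead of being escaped a second time.
    put("n", JsonUnquotedLiteral("\"\\u81ea\\u4fe1\""))
    // The trailing zeros survive because the text is never re-encoded as a Double.
    put("lmz", JsonUnquotedLiteral("1.000000"))
}

fun main() {
    println(Json.encodeToString(JsonElement.serializer(), buildSample()))
}
```

Because the content is not validated, anything passed to JsonUnquotedLiteral ends up in the document unchanged; malformed content produces malformed JSON.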

SMFDrummer · 2025-02-08T09:04:39Z

DAMN Thank YOU AAAAAAAAAAAAA
I used a third-party library, kotlinx-serialization-jsonpath by @nomisRev, which is built on Arrow. This library is so sick. (Apologies for the @ mention.)

fun String.addQuotes(): String = "\"$this\""
fun String.ensureAscii(): String = this.toCharArray().joinToString("") { "\\u" + it.code.toString(16).padStart(4, '0') }

val origin = """{"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.000000,"rsd":{"wr":0.300000,"lr":0.300000,"bc":0}}}"""

val data = Json.parseToJsonElement(origin) as JsonObject // {"sd":{"n":"自信的大嘴花","lmz":1.0,"rsd":{"wr":0.3,"lr":0.3,"bc":0}}}
// I can modify others here, and do something else..., and then ->
val data1 = JsonPath.path("sd.n").modify(data) {
    JsonUnquotedLiteral(it.jsonPrimitive.content.ensureAscii().escaped().addQuotes())
} // {"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.0,"rsd":{"wr":0.3,"lr":0.3,"bc":0}}}

val data2 = JsonPath.path("sd.lmz").modify(data1) {
    JsonUnquotedLiteral(it.jsonPrimitive.content.toBigDecimal().setScale(6).toString())
} // {"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.000000,"rsd":{"wr":0.3,"lr":0.3,"bc":0}}}

val data3 = JsonPath.path("sd.rsd.wr").modify(data2) {
    JsonUnquotedLiteral(it.jsonPrimitive.content.toBigDecimal().setScale(6).toString())
} // {"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.000000,"rsd":{"wr":0.300000,"lr":0.3,"bc":0}}}

val data4 = JsonPath.path("sd.rsd.lr").modify(data3) {
    JsonUnquotedLiteral(it.jsonPrimitive.content.toBigDecimal().setScale(6).toString())
} // {"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.000000,"rsd":{"wr":0.300000,"lr":0.300000,"bc":0}}}

println(Json.encodeToString(data4) == origin) // true

This resolves the issue, but I am not very familiar with Arrow and I am new to the JsonPath library, so I do not know how to chain these calls without declaring so many intermediate variables. Sorry for taking up your time. As a beginner, I would like to ask: how can I reduce the intermediate variables? (My code is not very polished, please forgive me.)
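One way to avoid per-path intermediate variables, sketched here without JsonPath or Arrow, is a single recursive pass over the JsonElement tree using only kotlinx-serialization-json. The names toAsciiJsonString and restoreFormatting are hypothetical, and the rule "every number written with a decimal point or exponent gets six decimal places" is an assumption that happens to fit the sample data, not something the library prescribes. The sketch is JVM-only because it relies on BigDecimal.

```kotlin
import java.math.RoundingMode
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonArray
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonNull
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
import kotlinx.serialization.json.JsonUnquotedLiteral

// Hypothetical helper: renders a quoted JSON string in which quotes, backslashes,
// control characters and all non-ASCII code units are escaped.
fun String.toAsciiJsonString(): String = buildString {
    append('"')
    for (ch in this@toAsciiJsonString) {
        when {
            ch == '"' -> append("\\\"")
            ch == '\\' -> append("\\\\")
            ch.code < 0x20 || ch.code >= 0x80 ->
                append("\\u").append(ch.code.toString(16).padStart(4, '0'))
            else -> append(ch)
        }
    }
    append('"')
}

// Hypothetical helper: rebuilds the tree, swapping strings and fractional
// numbers for unquoted literals that carry the desired textual form.
@OptIn(ExperimentalSerializationApi::class)
fun JsonElement.restoreFormatting(scale: Int = 6): JsonElement = when (this) {
    is JsonNull -> this // must be checked before JsonPrimitive
    is JsonObject -> JsonObject(mapValues { (_, v) -> v.restoreFormatting(scale) })
    is JsonArray -> JsonArray(map { it.restoreFormatting(scale) })
    is JsonPrimitive -> {
        val decimal = content.toBigDecimalOrNull()
        when {
            isString -> JsonUnquotedLiteral(content.toAsciiJsonString())
            // Assumption: only numbers with a '.' or exponent are rescaled;
            // integers such as 0 and booleans are left untouched.
            decimal != null && content.any { it in ".eE" } ->
                JsonUnquotedLiteral(decimal.setScale(scale, RoundingMode.HALF_UP).toPlainString())
            else -> this
        }
    }
}

fun main() {
    val origin = """{"sd":{"n":"\u81ea\u4fe1\u7684\u5927\u5634\u82b1","lmz":1.000000,"rsd":{"wr":0.300000,"lr":0.300000,"bc":0}}}"""
    val restored = Json.parseToJsonElement(origin).restoreFormatting()
    println(Json.encodeToString(JsonElement.serializer(), restored) == origin)
}
```

Any other modifications can be made on the parsed tree first, with restoreFormatting applied once just before encoding. If only some fields should be rescaled, the rule inside the JsonPrimitive branch is the place to narrow it.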

sandwwraith · 2025-02-14T17:25:36Z

Unfortunately, I'm not familiar with JsonPath or Arrow so I can't help you here

SMFDrummer · 2025-05-14T08:48:05Z

> Unfortunately, I'm not familiar with JsonPath or Arrow so I can't help you here

Hi, I'm back. You and other Kotlin users can follow this issue about modifying a JsonElement: nomisRev/kotlinx-serialization-jsonpath#69

SMFDrummer added the feature label Feb 6, 2025

sandwwraith added question and removed feature labels Feb 7, 2025

SMFDrummer mentioned this issue Feb 11, 2025

[Question] How do I modify json continuously? nomisRev/kotlinx-serialization-jsonpath#69

Open

sandwwraith closed this as completed Feb 14, 2025
