Friday 11 June 2010

Conversion of UTF-8 form fields in Multi Part Form

Got to say a big thanks to Abhishek at Running Commentary for the solution to this problem.

While using the Apache Commons FileUpload library on a Spanish-language form, I was having problems with the UTF-8 characters being submitted in people's names etc. Everything else was working properly, and the rest of the Spanish characters on the page were displaying correctly, but the input was being turned into garbage when submitted.

The solution ultimately was to change

item.getString()

to

item.getString("UTF-8").trim()

when extracting the items from the List that the FileUploadHandler returned after parsing the request object.

Bear in mind, it doesn't matter what you do before calling this: if you don't explicitly tell the FUH which encoding you want to use, the bytes get decoded with the library's default encoding (ISO-8859-1, as far as I can tell) rather than UTF-8, and you'll subsequently get garbage if your form fields contain anything other than standard ASCII.
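To show why the encoding argument matters, here is a minimal sketch that uses only the standard library. The EncodingDemo class and its decode helper are hypothetical stand-ins of my own, not part of Commons FileUpload; they just mimic decoding a field's raw bytes with a chosen charset and then trimming, which is roughly what I assume getString("UTF-8").trim() amounts to.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: decode the raw bytes of a form field with the
    // given charset and trim surrounding whitespace.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset).trim();
    }

    public static void main(String[] args) {
        // A Spanish name with a trailing space, written with Unicode escapes
        // (\u00e9 = e-acute, \u00f1 = n-tilde) so the source file encoding
        // does not affect the example.
        String typed = "Jos\u00e9 Mu\u00f1oz ";

        // The browser submits the field as UTF-8 bytes.
        byte[] raw = typed.getBytes(StandardCharsets.UTF_8);

        // Decoding with the wrong charset turns each two-byte character
        // into two unrelated characters.
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in restores the name.
        String right = decode(raw, StandardCharsets.UTF_8);

        System.out.println("wrong matches input: " + wrong.equals(typed.trim()));
        System.out.println("right matches input: " + right.equals(typed.trim()));
        System.out.println("wrong length: " + wrong.length()
                + ", right length: " + right.length());
    }
}
```

The wrongly decoded string comes out longer than the original because each accented character occupies two bytes in UTF-8, and a single-byte charset reads those as two separate characters.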

Hope this helps.

