2 Examples to Convert Byte[] Array to String in Java

Converting a byte array to String seems easy, but what is hard is doing it correctly. Many programmers make the mistake of ignoring character encoding whenever bytes are converted into a String or char, or vice versa. As programmers, we all know that computers only understand binary data, i.e. 0 and 1. Everything we see and use, e.g. images, text files, movies, or any other multimedia, is stored in the form of bytes, but what is more important is the process of encoding or decoding bytes to characters. Data conversion is an important topic in any programming interview, and because of the trickiness of character encoding, this question is one of the most popular String interview questions in Java interviews. While reading a String from an input source, e.g. XML files, an HTTP request, a network port, or a database, you must pay attention to which character encoding (e.g. UTF-8, UTF-16, or ISO 8859-1) the data is encoded in. If you do not use the same character encoding while converting bytes to String, you will end up with a corrupt String which may contain totally incorrect values. You might have seen '?' or square boxes after converting byte[] to String; those appear because your current character encoding does not support some of the values and just shows garbage in their place.

I tried to understand why programmers make character encoding mistakes more often than not, and my little research and own experience suggest that it may be because of two reasons: first, not dealing enough with internationalization and character encodings, and second, because ASCII characters are supported by almost all popular encoding schemes and have the same values in each of them. We mostly deal with encodings like UTF-8, Cp1252 and Windows-1252, which display ASCII characters (mostly letters and numbers) without fail, even if you use a different encoding scheme. The real issue comes when your text contains special characters, e.g. 'é', which is often used in French names. If your platform's character encoding doesn't recognize that character, then you will see either a different character or garbage, and sadly, until you get your hands burned, you are unlikely to be careful with character encoding. In Java, things are a little bit more tricky because many IO classes, e.g. InputStreamReader, use the platform's character encoding by default. What this means is that if you run your program on a different machine, you will likely get different output because of the different character encoding used on that machine. In this article, we will learn how to convert byte[] to String in Java, both by using the JDK API and with the help of Guava and Apache Commons.
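To make the ASCII point concrete, here is a small sketch that compares the bytes produced by UTF-8 and ISO-8859-1 for a plain ASCII word and for a word containing 'é'. The class and method names are made up for this illustration, and the accented letter is written as the Unicode escape \u00e9 so the example does not depend on the encoding of the source file itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AsciiVersusAccented {

    // Hypothetical helper: true when the text encodes to identical bytes
    // in UTF-8 and ISO-8859-1.
    static boolean sameBytesInBothEncodings(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        String ascii = "Paris";          // only ASCII characters
        String accented = "Cr\u00e9dit"; // \u00e9 is the letter 'é'

        System.out.println("ASCII word has same bytes    : " + sameBytesInBothEncodings(ascii));
        System.out.println("Accented word has same bytes : " + sameBytesInBothEncodings(accented));
    }
}
```

The ASCII word is byte-for-byte identical in both encodings, which is why a wrong charset can go unnoticed for a long time. The accented word is one byte longer in UTF-8, so decoding it with the wrong charset cannot give back the original text.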




How to convert byte[] to String in Java

There are multiple ways to change a byte array to String in Java. You can either use methods from the JDK, or you can use free open source APIs like Apache Commons and Google Guava. These APIs provide at least two sets of methods to create a String from a byte array: one which uses the default platform encoding, and another which takes a character encoding. You should always use the latter one; don't rely on platform encoding. I know, it could be the same, or you might not have faced any problem so far, but it's better to be safe than sorry. As I pointed out in my last post about printing a byte array as a Hex String, it's also one of the best practices to specify character encoding while converting bytes to characters in any programming language. It is also possible that your byte array contains non-printable ASCII characters. Let's first see the JDK's way of converting byte[] to String:

1) You can use the constructor of String which takes a byte array and a character encoding

String str = new String(bytes, "UTF-8");

This is the right way to convert bytes to String, provided you know for sure that the bytes are encoded in the character encoding you are using.

2) If you are reading a byte array from any file, e.g. an XML document, HTML file or binary file, you can use the Apache Commons IO library to convert the FileInputStream to a String directly. This method also buffers the input internally, so there is no need to use another BufferedInputStream.

String fromStream = IOUtils.toString(fileInputStream, "UTF-8");

In order to correctly convert those bytes into a String, you must first discover the correct character encoding by reading metadata, e.g. Content-Type, <?xml encoding="…"> etc., depending on the format/protocol of the data you are reading. This is one of the reasons I recommend using XML parsers, e.g. SAX or DOM parsers, to read XML files; they take care of character encoding by themselves.
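As a rough illustration of a parser taking care of the encoding, the sketch below builds a tiny XML document whose declaration says ISO-8859-1, encodes it with that charset, and hands the raw bytes to the JDK's DOM parser without ever naming a charset in our own code. The class name, helper names and sample content are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParserHandlesEncoding {

    // Hypothetical sample: bytes of an XML document encoded as ISO-8859-1,
    // matching the encoding named in its own declaration.
    static byte[] sampleLatin1Xml() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>Cr\u00e9dit Agricole SA</name>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Parses raw bytes; the parser reads the encoding from the XML declaration.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked exceptions
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        String name = rootText(sampleLatin1Xml());
        // The accented letter survives even though we never decoded the bytes ourselves
        System.out.println(name.equals("Cr\u00e9dit Agricole SA"));
    }
}
```

Because the parser works on bytes and consults the declaration itself, there is no place in this code where a wrong charset could sneak in.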

Some programmers also recommend using Charset over String for specifying character encoding, e.g. instead of "UTF-8" use StandardCharsets.UTF_8, mainly to avoid UnsupportedEncodingException in the worst case. There are six standard Charset implementations guaranteed to be supported by all Java platform implementations. You can use them instead of specifying the encoding scheme as a String. In short, always prefer StandardCharsets.ISO_8859_1 over "ISO-8859-1", as shown below:

String str = IOUtils.toString(fis,StandardCharsets.UTF_8);

Other standard charsets supported by the Java platform are:

  1. StandardCharsets.ISO_8859_1
  2. StandardCharsets.US_ASCII
  3. StandardCharsets.UTF_16
  4. StandardCharsets.UTF_16BE
  5. StandardCharsets.UTF_16LE
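The difference between naming a charset with a String and passing a constant is easiest to see side by side. The following is only a sketch with made-up class and method names: the first method has to deal with the checked UnsupportedEncodingException, while the second compiles without any exception handling.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class CharsetNameVersusConstant {

    // Charset given by name: the compiler forces us to handle
    // UnsupportedEncodingException, even for a name that always exists.
    static String decodeByName(byte[] bytes) {
        try {
            return new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    // Charset given as a constant: no checked exception and no risk of a typo
    // in the charset name slipping through to runtime.
    static String decodeByConstant(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of "Crédit": 'é' takes two bytes, 0xC3 0xA9
        byte[] bytes = {0x43, 0x72, (byte) 0xC3, (byte) 0xA9, 0x64, 0x69, 0x74};
        System.out.println(decodeByName(bytes).equals(decodeByConstant(bytes)));
    }
}
```

Both methods return the same String for the same input; the constant simply moves the "is this charset name valid" check from runtime to compile time.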


If you are reading bytes from an input stream, you can also check my earlier post about 5 ways to convert InputStream to String in Java for details.

Original XML
Here is our sample XML snippet to demonstrate issues with using the default character encoding. This file contains the letter 'é', which is not correctly displayed in Eclipse because its default character encoding is Cp1252.

<?xml version="1.0" encoding="UTF-8"?>
<banks>
    <bank>
        <name>Industrial & Commercial Bank of China </name>
        <headquarters> Beijing , China</headquarters>
    </bank>
    <bank>
        <name>Crédit Agricole SA</name>
        <headquarters>Montrouge, France</headquarters>
    </bank>
    <bank>
        <name>Société Générale</name>
        <headquarters>Paris, Île-de-France, France</headquarters>
    </bank>
</banks>

And this is what happens when you convert a byte array to String without specifying a character encoding, e.g.:

String str = new String(filedata);

This will use the platform's default character encoding, which is Cp1252 in this case, because we are running this program in the Eclipse IDE. You can see that the letter 'é' is not displayed correctly.

<?xml version="1.0" encoding="UTF-8"?>
<banks>
    <bank>
        <name>Industrial & Commercial Bank of China </name>
        <headquarters> Beijing , China</headquarters>
    </bank>
    <bank>
        <name>CrÃ©dit Agricole SA</name>
        <headquarters>Montrouge, France</headquarters>
    </bank>
    <bank>
        <name>SociÃ©tÃ© GÃ©nÃ©rale</name>
        <headquarters>Paris, ÃŽle-de-France, France</headquarters>
    </bank>
</banks>


To fix this, specify the character encoding while creating the String from the byte array, e.g.

String str = new String(filedata, "UTF-8");
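You can reproduce both outcomes without a file or a particular IDE setting. The sketch below encodes a name as UTF-8 and then decodes the same bytes twice, once with the wrong charset and once with the right one. It uses ISO-8859-1 as a stand-in for Cp1252, since it is guaranteed to exist on every Java platform and maps the two bytes of 'é' to the same pair of characters; the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes as UTF-8, then decodes with a single-byte charset (the mistake).
    static String decodeWrongly(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Encodes as UTF-8, then decodes with the same charset (the fix).
    static String decodeCorrectly(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "Cr\u00e9dit"; // \u00e9 is 'é'

        String wrong = decodeWrongly(name);
        String right = decodeCorrectly(name);

        // The wrong decode turns the single 'é' into two characters: \u00c3 \u00a9
        System.out.println("wrong length : " + wrong.length());
        System.out.println("right length : " + right.length());
        System.out.println("round trip ok: " + right.equals(name));
    }
}
```

The wrongly decoded String is one character longer than the original because each byte of the two-byte 'é' became a character of its own.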

By the way, let me make it clear that even though I have read XML files using InputStream here, it's not a good practice; in fact, it's a bad practice. You should always use proper XML parsers for reading XML documents. If you don't know how, please check this tutorial. Since this example is mostly to show you why character encoding matters, I have chosen an example which was easily available and looks more practical.


Java Program to Convert Byte array to String in Java

Here is our sample program to show why relying on the default character encoding is a bad idea and why you must specify a character encoding while converting a byte array to String in Java. In this program, we are using the Apache Commons IOUtils class to read the file directly into a byte array. IOUtils.toByteArray() buffers the input internally but does not close the stream, so the program opens the FileInputStream in a try-with-resources block to avoid leaking file descriptors. Now, how you create the String from that array is the key. If you provide the right character encoding, you will get correct output; otherwise you get a nearly correct but wrong output.

import java.io.FileInputStream;
import java.io.IOException;

import org.apache.commons.io.IOUtils;

/**
 * Java Program to convert byte array to String. In this example, we have first
 * read an XML file with character encoding "UTF-8" into a byte array and then
 * created a String from that. When you don't specify a character encoding, Java
 * uses the platform's default encoding, which may not be the same if the file is
 * an XML document coming from another system, emails, or plain text files
 * fetched from an HTTP server etc. You must first find the correct character
 * encoding and then use it while converting byte array to String.
 *
 * @author Javin Paul
 */
public class ByteArrayToString {

    public static void main(String args[]) throws IOException {

        System.out.println("Platform Encoding : " + System.getProperty("file.encoding"));

        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream("info.xml")) {

            // Using Apache Commons IOUtils to read file into byte array
            byte[] filedata = IOUtils.toByteArray(fis);

            // Decode with the encoding declared in the XML file, not the platform default
            String str = new String(filedata, "UTF-8");
            System.out.println(str);
        }
    }
}

Output :
Platform Encoding : Cp1252
<?xml version="1.0" encoding="UTF-8"?>
<banks>
    <bank>
        <name>Industrial & Commercial Bank of China </name>
        <headquarters> Beijing , China</headquarters>
    </bank>
    <bank>
        <name>Crédit Agricole SA</name>
        <headquarters>Montrouge, France</headquarters>
    </bank>
    <bank>
        <name>Société Générale</name>
        <headquarters>Paris, Île-de-France, France</headquarters>
    </bank>
</banks>
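If you would rather not add a library just for this, the same idea can be sketched with the JDK alone using java.nio.file.Files. This is not the program from this article, only an alternative under a few assumptions: the class name, the helper names and the temporary sample file are made up so that the example runs on its own.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadFileAsString {

    // Hypothetical helper: writes a small UTF-8 encoded XML file to a temp location.
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("banks", ".xml");
            String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                    + "<name>Soci\u00e9t\u00e9 G\u00e9n\u00e9rale</name>\n";
            Files.write(file, xml.getBytes(StandardCharsets.UTF_8));
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads all bytes (the file is opened and closed for us) and decodes them as UTF-8.
    static String readUtf8(Path file) {
        try {
            byte[] filedata = Files.readAllBytes(file);
            return new String(filedata, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String content = readUtf8(writeSample());
        // Check for the accented name instead of printing it, so the result
        // does not depend on the console's own encoding
        System.out.println(content.contains("Soci\u00e9t\u00e9 G\u00e9n\u00e9rale"));
    }
}
```

Files.readAllBytes() opens and closes the file itself, and the charset is stated explicitly at the one place where bytes become characters.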


Things to remember and Best Practices

Always remember, using a character encoding while converting a byte array to String is not just a best practice but a mandatory thing. You should always use it, irrespective of programming language. By the way, you can take note of the following things, which will help you to avoid a couple of nasty issues:

  • Use the character encoding from the source, e.g. Content-Type in HTML files, or <?xml encoding="…">.
  • Use XML parsers to parse XML files instead of finding the character encoding and reading the file via InputStream; some things are best left for demo code only.
  • Prefer Charset constants, e.g. StandardCharsets.UTF_16, instead of the String "UTF-16".
  • Never rely on the platform's default encoding scheme.

These rules should also be applied when you convert character data to bytes, e.g. converting a String to a byte array using the String.getBytes() method. In this case it will use the platform's default character encoding; instead of this, you should use the overloaded version which takes a character encoding.
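For the opposite direction, the sketch below uses the getBytes(Charset) overload and shows that the same one-character String produces a different number of bytes depending on the charset you pass. The class and method names are only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodeWithCharset {

    // Number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String e = "\u00e9"; // the single letter 'é'

        System.out.println("UTF-8      : " + encodedLength(e, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 : " + encodedLength(e, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-16BE   : " + encodedLength(e, StandardCharsets.UTF_16BE));
    }
}
```

Since the byte count alone already differs between charsets, whoever reads these bytes later must be told, or must be able to find out, which charset was used to write them.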

That's all on how to convert a byte array to String in Java. As you can see, the Java API, especially the java.lang.String class, provides methods and constructors that take a byte[] and return a String (or vice versa), but by default they rely on the platform's character encoding, which may not be correct if the byte array is created from XML files, HTTP request data or network protocols. You should always get the right encoding from the source itself. If you would like to read more about what every programmer should know about String, you can check out this article.





Source: https://javarevisited.blogspot.com/
